Preface


The  good  news  about  computers  is  that  they  do  what  you  tell  them 
to  do.  The  bad  news  is  that  they  do  what  you  tell  them  to  do. 

-  Ted  Nelson. 

Computer  Science  is  a  relatively  young  discipline.  University  Computer 
Science  Departments  are  rarely  more  than  a  few  decades  old.  They  will 
typically have emerged either from a Mathematics Department or an
Engineering Department, and until recently a Computer Science degree was
predominantly about writing computer programs (the mathematical software)
and building computers (the engineering hardware). Textbooks typically
referred to programming as an “art” or a “craft” with little scientific basis
compared to traditional engineering subjects, and many computer programmers
still like to see themselves as part of a pop culture of geeks and hackers
rather  than  as  academically-trained  professionals. 

However,  the  nature  of  Computer  Science  is  changing  rapidly,  reflecting 
the  increasing  ubiquity  and  importance  of  its  subject  matter.  In  the  last 
decades,  computational  methods  and  tools  have  revolutionised  the  sciences, 
engineering  and  technology.  Computational  concepts  and  techniques  are 
starting  to  influence  the  way  we  think,  reason  and  tackle  problems;  and 
computing systems have become an integral part of our professional,
economic and social lives. The more we depend on these systems - particularly
for safety-critical or economically-critical applications - the more we must
ensure that they are safe, reliable and well designed, and the less forgiving
we can be of failures, delays or inconveniences caused by the notorious
“computer  glitch.” 

Traditional engineering disciplines are solidly rooted in centuries-old
mathematical theories. The mathematical foundations underlying
Computer Science are younger, and Computer Scientists have yet to
agree  on  how  best  to  approach  the  fundamental  concepts  and  tasks  in  the 
design  of  computing  systems.  The  Civil  Engineer  knows  exactly  how  to 
define  and  analyse  a  mathematical  model  of  the  components  of  a  bridge 
design  so  that  it  can  be  relied  on  not  to  fall  down,  and  the  Aeronautical 
Engineer  knows  exactly  how  to  define  and  analyse  a  mathematical  model  of 
an  aeroplane  wing  for  the  same  purpose.  However,  Software  Engineers  have 
few  universally-accepted  mathematical  modelling  tools  at  their  disposal.  In 
the words of the eminent Computer Scientist Alan Kay, “most undergraduate
degrees in computer science these days are basically Java vocational
training.”  But  computing  systems  can  be  at  least  as  complex  as  bridges  or 
aeroplanes,  and  a  canon  of  mathematical  methods  for  modelling  computing 
systems  is  therefore  very  much  needed.  “Software’s  Chronic  Crisis”  was  the 
title  of  a  popular  and  widely-cited  Scientific  American  article  from  1994, 
and,  unfortunately,  its  message  remains  valid  today. 

University  Computer  Science  Departments  face  a  sociological  challenge 
posed  by  the  fact  that  computers  have  become  everyday,  deceptively  easy- 
to-use  objects.  A  single  generation  ago,  new  Computer  Science  students 
typically  had  teenage  backgrounds  spent  writing  Basic  and/or  Assembly 
Language  programs  for  their  early  hobbyist  computers.  A  passion  for  this 
activity is what drove these students into University Computer Science
programmes, and they were not disappointed with the education they received.
Their modern-day successors, on the other hand - born directly into the
heart of the computer era - have grown up with the internet, a billion-dollar
computer games industry, and mobile phones with more computing power
than the space shuttle. They often choose to study Computer Science on
the basis of a passion for using computing devices throughout their
everyday lives, for everything from socialising with their friends to
downloading the latest films, and they often give less thought than they
might to what a University Computer Science programme actually entails:
far more than just using computers.

There is a universal trend of large numbers of first-year students
transferring out of Computer Science programmes and into related programmes
such  as  Media  Studies  or  Information  Studies.  This  trend,  we  feel,  is  often 
unjustified,  and  can  be  reversed  by  a  more  considered  approach  to  modelling 
and  the  mathematical  foundations  of  system  design,  one  which  the  students 
can connect with and feel at home with right from the beginning of their
University education. This has been our motivation in writing this textbook,
which aims to teach first-year undergraduate students the essential
mathematics and modelling techniques for computing systems in a novel and
relatively lightweight way.

The  book  is  divided  into  two  parts.  Part  I,  subtitled  Mathematics  for 
Computer  Science,  introduces  concepts  from  Discrete  Mathematics  which 
are  in  the  curriculum  of  any  University  Computer  Science  programme,  as 
well  as  much  which  often  is  not.  This  material  is  typically  taught  in  service 
modules  by  mathematicians,  and  new  Computer  Science  students  often  find 
it difficult to engage with the material presented in a purely mathematical
context. We attempt here to present the material in an engaging and
motivating  fashion  as  the  basis  of  computational  thinking. 

Part II of the book - Modelling Computing Systems - develops a particular
approach to modelling based on state transition systems. State transition
systems have always featured in the Computer Science curriculum,
but  traditionally  (and  increasingly  historically)  only  within  the  study  of 
formal  languages.  Here  we  introduce  them  as  general  modelling  devices, 
and  explore  languages  and  techniques  for  expressing  and  reasoning  about 
system  specifications  and  (concurrent)  implementations.  Although  Part  I 
covers  twice  as  many  pages  as  Part  II,  the  title  of  the  book  is  nonetheless 
justified: much of the Mathematics presented in Part I is itself used directly
for modelling systems, and forms the basis of the approach developed
in Part II.

The  main  benefit  of  mathematical  formalisation  is  that  systems  can  be 
modelled and analysed in precise and unambiguous ways; but formal precision
can also be a major pitfall in modelling since it can compromise
simplicity  and  intuition.  In  this  book,  therefore,  we  always  try  to  start  from 
intuition  and  examples,  and  we  aim  at  developing  precise  concepts  from  that 
basis.  How  and  when  to  be  precise  is  certainly  not  less  important  to  learn 
than  precision  itself:  the  ability  to  give  mathematical  proofs  often  does  not 
depend  on  knowing  precise  formal  definitions  and  foundations.  One  can, 
for  example,  write  down  recursive  functions  without  having  a  precise  formal 
concept  in  mind. 

There is a long-standing tradition in disciplines like Physics of teaching
modelling through little artifacts. Likewise, the fundamental ideas of
computational modelling and thinking can be learned better from idealised
examples and exercises than from many real-world computer applications. This
book  builds  on  a  large  collection  of  logical  puzzles  and  mathematical  games 
that  require  no  prior  knowledge  about  computers  and  computing  systems; 
these  can  be  much  more  fun  and  sometimes  much  more  challenging  than 
analysing a device driver or a criminal record database. Also, computational
modelling and thinking is about much more than just computers!

In  fact,  games  play  a  far  more  important  role  in  the  book:  they  provide 
a  novel  approach  to  understanding  computer  software  and  systems  which  is 
proving  to  be  very  successful  both  in  theory  and  practice.  When  a  computer 
runs  a  program,  for  example,  it  is  in  a  sense  playing  a  game  against  the 
user  who  is  providing  the  input  to  the  program.  The  program  represents 
a  strategy  which  the  computer  is  using  in  this  game,  and  the  computer 
wins  the  game  if  it  correctly  computes  the  result.  In  this  game,  the  user 
is  the  adversary  of  the  computer  and  is  naturally  trying  to  confound  the 
computer,  which  itself  is  attempting  to  defend  its  claim  that  it  is  computing 
correctly,  that  is,  that  the  program  it  is  running  is  a  winning  strategy.  (In 
Software  Engineering,  this  game  appears  in  the  guise  of  testing.)  Similarly, 
the  controller  of  a  software  system  that  interacts  with  its  environment  plays 
a game against the environment: the controller tries to maintain the system’s
properties,  while  the  environment  tries  to  confound  them. 

This  view  suggests  an  approach  to  addressing  three  basic  problems  in 
the  design  of  computing  systems: 

1.  Specification  refers  to  the  problem  of  precisely  identifying  the  task  to 
be  solved,  as  well  as  what  exactly  constitutes  a  solution.  This  problem 
corresponds  to  the  problem  of  defining  a  winning  strategy. 

2.  Implementation  or  Synthesis  refers  to  the  problem  of  devising  a 
solution  to  the  task  which  respects  the  specification.  This  problem 
corresponds  to  the  problem  of  implementing  a  winning  strategy. 

3.  Verification  refers  to  the  problem  of  demonstrating  that  the  devised 
solution does indeed respect the specification. This problem corresponds
to the problem of proving that a given strategy is in fact a
winning  strategy. 

This  analogy  between  the  fundamental  concepts  in  Software  Engineering  on 
the  one  hand,  and  games  and  strategies  on  the  other,  provides  a  mode  of 
computational  thinking  which  comes  naturally  to  the  human  mind,  and  can 
be readily exploited to explain and understand Software Engineering concepts
and their applications. It also motivates our thesis that Game Theory
provides  a  paradigm  for  understanding  the  nature  of  computation. 
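
The correspondence can be illustrated with a small Python sketch. It is an
illustration only, under assumed names: specification, strategy and verify
are hypothetical and do not come from the text. Sorting is taken as the
task. The specification states what counts as a win for the computer, the
strategy is the program itself, and verification lets the adversary try
every input up to a small bound.

```python
from collections import Counter
from itertools import product


def specification(inp, out):
    """Winning condition: out is a non-decreasing rearrangement of inp."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(inp) == Counter(out)


def strategy(inp):
    """The computer's strategy: a plain insertion sort."""
    out = []
    for x in inp:
        i = len(out)
        # Walk left past every element that is larger than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def verify(max_len=4, values=(0, 1, 2)):
    """Play every adversary input up to a bound.

    Returns an input on which the strategy loses, or None if there is none.
    """
    for n in range(max_len + 1):
        for inp in product(values, repeat=n):
            if not specification(inp, strategy(inp)):
                return inp
    return None


print(verify())
```

A bounded search of this kind is testing rather than proof: it shows only
that the adversary has no winning input among those tried.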

There are over 200 exercises presented throughout the chapters, all of
which have complete solutions at the back of the book, as well as over 200
further exercises at the ends of the chapters whose solutions are not provided.
The  exercises  within  the  chapters  are  often  used  to  explore  subtleties  or 
side-issues,  or  simply  to  put  lengthy  arguments  into  an  appendix,  and  as 
such  should  all  be  attempted;  their  solutions  at  the  back  of  the  book  should 
be  looked  at  as  well,  as  they  often  explain  the  issues  which  the  exercises  are 
attempting  to  highlight. 

Most  of  the  material  in  this  book  has  been  used  successfully  for  over  a 
decade  in  first-year  Discrete  Mathematics  and  Systems  Modelling  modules. 
Countless  eyes  have  passed  over  the  text,  and  a  thousand  students  have 
solved  its  exercises.  Nonetheless  there  will  inevitably  be  a  (hopefully  small) 
flurry  of  errors  in  the  text  for  which  we  accept  full  responsibility  and  offer 
our  sincere  apologies. 
.  .  .  for  by  the  error  of  some  calculator  the  vessel  often  splits  upon
a  rock  that  should  have  reached  a  friendly  pier  .  .  . 

-  Henry  David  Thoreau. 

We all know from personal experience that computers do not work correctly
all the time. For most of us this realisation manifests itself with
nothing more serious than delays and frustrations as we encounter automatic
bank tellers which are out of order or Web sites which are faulty, or
face long waits at airports as glitches in the booking, check-in, or even the
flight control system are being dealt with.

However,  the  problems  of  systems  failures  become  more  serious  (costly, 
deadly)  as  automatic  control  systems  find  their  way  into  almost  every  aspect 
of  our  daily  lives.  It  is  recognised  -  and  accepted  -  that  complete  reliability 
of  any  major  software  system  is  beyond  expectation.  While,  for  example, 
civil and mechanical engineers can build impressive bridges which are
guaranteed to remain standing, and aeronautical engineers can design aeroplane
wings which behave in precise and predictable ways, Software Engineers
are  almost  never  so  successful.  Computers  carry  viruses,  hang,  crash  or  die; 
and  their  software  is  full  of  security  leaks  and  bugs.  Designing  dependable 
high  quality  computing  systems  remains  a  challenge  for  Software  Engineers. 

Quality expectations, of course, have much to do with culture and context.
Many of us are willing to accept as a mere inconvenience that a train
can be delayed, a cash machine can be out of order, or a mail server can
temporarily be down. But we don’t tolerate bridges that fall down, nuclear
meltdowns in power stations, or security leaks in public databases.
Engineers speak about safety-critical applications when system failure cannot
be tolerated, but we should expect software systems to be user friendly,
safe  and  dependable  in  any  application  context.  Why  is  this  so  difficult  to 
achieve? 

There are several answers to this question. One of them is that software
systems can be extremely complex - more complex even than most other systems
that Engineers can design and build. They may consist of large numbers
of heterogeneous components that can change over time and interact
in intricate and sophisticated ways. Another answer is that our knowledge
and  experience  in  designing  such  systems,  and  our  expertise  in  organising 
their  design  process,  are  still  in  their  infancy  compared  to  building  bridges, 
chemical  plants  or  railway  networks.  A  third  answer  -  and  perhaps  the 
most important one - is that rigorous mathematical tools and methods are
used much less in Software Engineering than in other engineering disciplines.
While traditional engineers have always been academically trained
to  use  tools  and  techniques  from  mathematics  and  physics  to  guarantee  the 
quality  of  their  products,  software  designers  and  programmers  are  still  often 
self-taught  and  have  traditionally  relied  very  much  on  their  intuition  and 
intelligence. Many of them seem to have a rather fatalistic attitude towards
bugs.

This attitude is more and more difficult to defend. Programs and software
systems are themselves mathematical structures; many software systems
rely on sophisticated mathematical mechanisms such as audio compression,
public key cryptography, or Web search ranking; and there are
powerful  domain-specific  mathematical  tools  and  techniques  that  can  help 
us  to  understand,  design,  implement  and  analyse  them  in  a  better,  more 
structured,  and  more  scientific  way. 

The  aim  of  this  book  is  to  introduce  some  of  the  techniques  which,  when 
applied, can help to reduce the number of errors present in a system.
Errors can arise at many points in the software development process, from
understanding exactly the requirements and behaviour of the system being
built, to ensuring that these requirements are correctly captured in the
design and implementation of the system. By working within the confines of a
precise  structured  method,  the  occurrence  of  such  errors  can  be  drastically 
curtailed. 


0.1 Examples of System Failures

To understand and appreciate the role of mathematics in modelling computing
systems, it is helpful to look at a variety of examples of system failure.
Some  of  these  failures  are  of  an  entirely  technological  nature,  others  have  to 
do  with  the  ways  in  which  humans  and  machines  interact  or  in  which  rules 
of  communication  between  different  agents  have  been  designed.  In  every 
case,  they  arise  from  errors  in  information  processing,  which  is  at  the  very 
core  of  computational  modelling. 

0.1.1  Clayton  Tunnel  Accident 

Up  until  the  mid-nineteenth  century,  collisions  between  trains  were  avoided 
solely by enforcing a minimum time interval between trains. Railway
employees (known as “policemen”) would stand at regular intervals (“blocks”)
along the line and signal trains with hand gestures to slow down if too
little time had elapsed since the previous train had passed. In the case of a
break-down  of  a  train,  the  guard  in  the  rear  of  the  train  would  run  back 
along  the  track  to  warn  any  oncoming  trains  of  the  danger  ahead. 

With  ever-increasing  traffic,  growing  lapses  in  this  system  eventually 
led  to  the  installation  of  crude  block  signalling  in  particularly  troublesome 
places,  in  which  some  protocol  would  be  followed  to  ensure  that  only  one 
train occupied a given stretch of track at any given moment. Such a protocol
typically involved railway workers at each end of the section notifying
each other by telegraph of the passage of trains: having let one train
proceed, the signalman would hold back any further trains until a message was
received  indicating  that  the  first  train  had  cleared  the  section  ahead.  The 
first  commercial  electric  telegraph  was  constructed  in  Britain  for  use  on 
the  Great  Western  Railway,  and  the  first  section  of  track  to  be  protected 
by  block  signalling  using  telegraph  communication  was  the  track  through 
Clayton  tunnel  outside  Brighton.  However,  on  the  morning  of  Sunday  25 
August  1861,  this  protocol  failed  to  prevent  a  catastrophic  collision  inside 
the  tunnel  which  killed  23  people  and  injured  a  further  176. 

In  normal  operation,  when  a  train  arrived  at  Clayton  tunnel,  it  would 
meet  a  rail-side  signal  which  would  be  set  at  “danger”  unless  the  signalman 
at  the  entrance  to  the  tunnel  set  it  to  “all  right”  authorising  the  train  to  enter 
the  tunnel.  This  signalman  would  telegraph  a  “train  in  tunnel”  message  to 
the  signalman  at  the  other  end  of  the  tunnel,  and  the  rail-side  signal  would 
be  reset  to  “danger”  to  prevent  any  further  trains  from  entering  the  tunnel 
until  the  signalman  received  a  “tunnel  clear”  message  by  telegraph  from 
the  signalman  at  the  other  end  of  the  tunnel,  indicating  that  the  train  had 
emerged  from  the  other  side. 

On  the  fateful  morning  in  question,  three  trains  left  Brighton  Station 
within  a  seven-minute  period  and  steamed  towards  Clayton  tunnel.  These 
trains  were  scheduled  to  depart  at  8:05,  8:15  and  8:30,  respectively;  however, 
the  first  train  was  running  late,  and  the  assistant  stationmaster  in  charge 
that  morning  -  one  Charles  Legg  -  opted  to  ignore  the  strict  regulation  of 
ensuring  a  minimum  five-minute  separation  between  trains  by  sending  them 
off  at  8:28,  8:31  and  8:35,  respectively.  The  first  train  was  given  the  “all 
right”  signal  to  enter  the  tunnel,  and  the  signalman  -  named  Henry  Killick 
-  telegraphed  the  “train  in  tunnel”  message  to  his  counterpart  -  a  man  by 
the  name  of  Brown  -  at  the  other  end.  He  was  then  taken  by  surprise  by  the 
quick  arrival  of  the  second  train,  which  passed  the  rail-side  signal  before  he 
had  had  a  chance  to  reset  it  to  “danger.”  In  desperation,  he  rushed  out  of 
his  cabin  furiously  waving  his  red  flag  to  stop  the  second  train  just  as  it  was 
disappearing  into  the  tunnel;  there  was  no  way  for  him  to  know,  however, 
whether  or  not  the  driver  had  seen  the  flag. 
Killick telegraphed to Brown a further “train in tunnel” message and
waited anxiously for a response. He then telegraphed a further message to
Brown  asking  if  the  tunnel  was  clear;  and  to  his  relief  he  finally  received  the 
“tunnel  clear”  message  from  Brown.  Unfortunately  for  Killick,  Brown  had 
not  realised  from  Killick’s  repeated  “train  in  tunnel”  message  that  a  second 
train  had  entered  the  tunnel;  his  “tunnel  clear”  message  was  in  response  to 
the  passing  of  the  first  train.  The  driver  of  the  second  train  -  unbeknownst 
to  Killick  -  had  in  fact  seen  the  red  flag;  and  having  finally  brought  his  heavy 
load  to  a  stop,  he  was  in  the  process  of  cautiously  reversing  back  towards 
Killick.  When,  at  that  moment,  the  third  train  arrived  at  the  entrance  to 
the  tunnel,  Killick  offered  it  the  “all  right”  signal  -  with  fatal  consequences. 

The Clayton tunnel accident was obviously not the result of a computer
failure, but it arose from a poorly designed communication protocol between
distributed agents, and is therefore typical of failures in computing systems.
Mutual exclusion algorithms, which prevent more than one computing agent at
a time
accessing  a  resource  such  as  a  printer  or  a  global  variable,  are  instrumental 
parts  of  any  operating  system.  The  accident  also  shows  a  standard  pitfall  of 
systems  design:  the  whole  idea  of  the  signalling  protocol  at  Clayton  tunnel 
was  to  ensure  that  two  trains  could  not  occupy  the  same  block  at  the  same 
time. But it couldn’t handle the exceptional case it was supposed to prevent.
(For details, see L.T.C. Rolt, Red for Danger: The Classic History
of  Railway  Disasters,  The  History  Press,  2009.) 
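
The flaw in the protocol can be made concrete with a toy state model. The
Python sketch below is a deliberate simplification written for illustration,
not a historical reconstruction: the class names, the single believed_clear
flag and the counting repair are all assumptions, and the red flag and the
reversing train are left out.

```python
class ClaytonTunnel:
    """Toy model: the true tunnel contents versus the entry signalman's belief."""

    def __init__(self):
        self.in_tunnel = set()       # trains actually inside the tunnel
        self.believed_clear = True   # what the entry signalman believes

    def enter(self, train):
        self.in_tunnel.add(train)
        self.believed_clear = False  # "train in tunnel" is telegraphed

    def leave(self, train):
        self.in_tunnel.discard(train)

    def tunnel_clear_received(self):
        # The message names no train and carries no count, so a single
        # "tunnel clear" resets the belief however many trains went in.
        self.believed_clear = True

    def may_admit(self):
        return self.believed_clear

    def safe(self):
        return len(self.in_tunnel) <= 1


class CountingTunnel(ClaytonTunnel):
    """Hypothetical repair: match each "train in tunnel" with a "tunnel clear"."""

    def __init__(self):
        super().__init__()
        self.pending = 0             # entries not yet answered by a "clear"

    def enter(self, train):
        super().enter(train)
        self.pending += 1

    def tunnel_clear_received(self):
        self.pending -= 1
        self.believed_clear = self.pending == 0


def replay(tunnel):
    """Replay the morning's events; return (third train admitted, still safe)."""
    tunnel.enter("first")
    tunnel.enter("second")           # passes the signal before it is reset
    tunnel.leave("first")
    tunnel.tunnel_clear_received()   # sent in response to the first train only
    admitted = tunnel.may_admit()
    if admitted:
        tunnel.enter("third")
    return admitted, tunnel.safe()


print(replay(ClaytonTunnel()))
print(replay(CountingTunnel()))
```

In the first model the signalman's belief and the true state of the tunnel
diverge as soon as two trains are inside at once, which is exactly the
situation the protocol was meant to exclude. The counting variant is only one
conceivable repair and is not taken from the historical record.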

0.1.2  USS  Scorpion 

In 1968, the nuclear submarine USS Scorpion was destroyed, killing all 99 of
its crew members. Though the cause of its destruction has long been
steeped  in  mystery,  evidence  which  was  only  declassified  three  decades  later 
suggests  that  the  submarine  may  in  fact  have  been  destroyed  by  one  of  its 
own  torpedoes  which  had  been  accidentally  activated  and  thus  ejected.  The 
torpedo  had  been  cleverly  designed  to  seek  out  its  nearest  target,  which 
is  precisely  what  it  did  on  this  occasion,  with  devastating  consequences. 
(For details, see P G Neumann, Computer-Related Risks, Addison-Wesley,
1994.)

The negative implications of seemingly sensible and harmless design decisions
often become apparent only in hindsight as unintended consequences after disaster
has  struck.  Clearly,  every  eventuality  needs  to  be  accounted  for,  especially 
in  safety-critical  designs  where  failure  of  the  system  could  lead  to  injury, 
illness  or  loss  of  life;  serious  environmental  damage;  or  major  financial  loss. 

0.1.3  Therac  25  Radiotherapy  Machine 

The Therac 25 was a radiation therapy machine that intermittently gave
the wrong radiation doses over a period of three years (1985-87) due
predominantly to errors in the software controlling its operation, as well as its
poor  interface  design.  The  problems  with  the  Therac  25  have  been  very 
thoroughly  analysed,  and  six  accidents  -  three  of  them  fatal  -  have  been 
attributed  to  its  failures.  (For  details,  see  N  Leveson  and  C  S  Turner,  “An 
Investigation  of  the  Therac-25  Accidents,”  IEEE  Computer  26(7),  pages 
18-41,  July  1993.) 

The  basic  issue  involved  the  replacement  of  hardware  interlocks  used  in 
previous  models  by  a  software-only  system.  The  machine  had  two  modes 
of  operation:  electron  mode  and  photon  (or  X-ray)  mode,  which  were  used 
for  treating  tumours  at  different  depths  in  a  patient’s  body.  Electron  mode 
involved a low-power electron beam, while photon (X-ray) mode involved a
high-power electron beam (three orders of magnitude more powerful), but
with a metal plate between the device and the patient, to generate the
X-rays. The electron beam had to be in low-power mode if the plate was
not  present,  and  in  earlier  designs  (Therac  6  and  Therac  20)  there  was  a 
mechanical  interlocking  device  which  physically  ensured  this.  This  hardware 
interlock was removed from the Therac 25, which was left to rely on a (faulty)
software  interlock. 
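
The safety requirement stated above is simple enough to write down as an
explicit predicate. The Python sketch below is an illustration only and has
nothing to do with the actual Therac code: the names Power, beam_permitted
and fire are assumptions. It shows the requirement expressed as an invariant
that is checked immediately before the beam is switched on.

```python
from enum import Enum


class Power(Enum):
    LOW = "low"    # electron mode
    HIGH = "high"  # photon (X-ray) mode


def beam_permitted(power, plate_in_place):
    """Invariant from the text: without the plate, the beam must be low-power."""
    return power is Power.LOW or plate_in_place


def fire(power, plate_in_place):
    """Refuse to fire unless the invariant holds at the moment of firing."""
    if not beam_permitted(power, plate_in_place):
        raise RuntimeError("interlock: high-power beam requested without plate")
    return f"beam on ({power.value} power)"


print(fire(Power.LOW, plate_in_place=False))
print(fire(Power.HIGH, plate_in_place=True))
```

Stating the invariant as a separate function means it can be tested for all
four combinations of inputs independently of the rest of the control software.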

The software was poorly specified (there was no documentation of its
software specification), designed and tested; and much of it was imported
as-is from the previous models despite changes in requirements, without any
form of integration testing. The problem was compounded by a complex
user interface. In some cases, if the operator tried to enter certain control
sequences (either in error or as shortcuts), the machine would operate
incorrectly, using the high-power beam with no plate. It would then report
an error of the kind it would normally report when no treatment had been
delivered. Often in response to such an error report, operators would repeat
the whole process, leading ultimately to fatal unintended consequences.

Leveson and Turner draw the following conclusion: “Virtually all complex
software will behave in an unexpected or undesired fashion under some
conditions - there will always be another bug. Accidents are seldom simple -
they usually involve a complex web of interacting events with multiple
contributing technical, human, and organisational factors.” To improve the
situation, they appeal to education: “Taking a couple of programming courses
or programming a home computer does not qualify anyone to produce
safety-critical software.” The lesson is clear: the same rigorous standards should
be  applied  to  Software  Engineering  as  to  Engineering  in  general. 

0.1.4  London  Ambulance  Service 

In October 1992, the London Ambulance Service installed a computer-aided
dispatch  (CAD)  system,  known  as  LASCAD,  to  control  the  dispatching  of 
ambulances across London. It was to match each incoming call automatically
with the closest available ambulance. However, the system
was unable to cope with real-time data, which was on the order of 5000
calls  per  day.  As  it  became  more  and  more  swamped  with  information  and 
requests,  it  generated  more  and  more  exception  messages  requiring  human 
intervention.  The  volume  of  these  messages  caused  the  exception  messages, 
together  with  information  needing  to  be  dispatched  to  ambulances,  to  scroll 
off  the  top  of  the  controllers’  screens.  As  many  as  thirty  deaths  have  been 
attributed  to  failings  of  the  system.  For  example,  it  was  reported  that  one 
ambulance  arrived  to  find  the  patient  had  died  and  long  since  been  collected 
by  the  undertaker;  and  that  another  ambulance  took  11  hours  to  reach  its 
destination  -  five  hours  after  the  stroke  victim  had  made  their  own  way  to 
the  hospital. 

The London Ambulance Service quickly reverted partially to its manual
dispatching system. However, after eight days, the automated system
crashed completely, leading the service to revert entirely to the original
manual system. Taking responsibility for the £1.5 million failure, the
chief  executive  of  the  London  Ambulance  Service  duly  resigned  from  his 
post.  (For  details,  see  A  Finkelstein  and  J  Dowell,  “A  Comedy  of  Errors: 
The  London  Ambulance  Service  Case  Study,”  in  The  Eighth  International 
Workshop  on  Software  Specification  and  Design,  IEEE  CS  Press,  pages 
2-4,  1996.) 

As  in  our  previous  examples,  this  disaster  was  caused  by  a  complex  web 
of  managerial  and  economic  pressure,  incompetence  and  technical  failure; 
but  Finkelstein  and  Dowell  conclude  that  “at  the  heart  of  the  failure  are 
breakdowns in specification and design common to many software development
projects.”

0.1.5  Intel  Pentium 

In 1994, problems were found in the floating-point unit of the Intel Pentium
processor, which had been released the previous year. With certain inputs, the
unit gave inaccurate results when performing division, making it unreliable
for mathematical or scientific work.

The error was introduced at the design stage of the chip, when a new
algorithm for floating-point division was implemented which was three to five
times faster than previous methods. This algorithm is based on using look-up
tables to calculate intermediate results. The hardware was implemented
using  a  program  to  download  values  into  the  look-up  tables;  however,  an 
error  in  this  software  caused  five  of  the  1066  entries  to  be  inadvertently 
omitted. 

Because the calculations recursively use information from the look-up
tables, the errors that accrue are magnified in scale. For example, evaluating
the expression x - (x/y) * y should return the answer 0 for any inputs x and y
with y nonzero. Given that computers have to deal with approximations to real
numbers, we typically have to settle for a value close to zero to be returned.
But with input values x = 4195835 and y = 3145727 the first Pentium release
gave  the  answer  256.  (For  details,  see  T  R  Halfhill,  “The  Truth  Behind  the 
Pentium  Bug,”  Byte  20(3),  pages  163-164,  March  1995.) 
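
The check is easy to repeat on a machine with correct floating-point
division. The Python sketch below (the function name division_residual is an
assumption) evaluates the same expression in double-precision arithmetic and,
for comparison, in exact rational arithmetic.

```python
from fractions import Fraction


def division_residual(x, y):
    """Evaluate x - (x / y) * y, which is mathematically 0 whenever y != 0."""
    return x - (x / y) * y


# The inputs quoted in the text.
approx = division_residual(4195835.0, 3145727.0)                  # floating point
exact = division_residual(Fraction(4195835), Fraction(3145727))   # exact rationals

print(approx)  # zero, or within rounding error of zero, on correct hardware
print(exact)   # exactly 0
```

With correct division the floating-point result can differ from zero only by
rounding error, many orders of magnitude smaller than the value 256 quoted
above.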

0.1.6  Ariane  5 

In  June  1996,  the  maiden  flight  of  the  Ariane  5  satellite  launch  rocket, 
Flight  501,  ended  in  disaster:  the  rocket  veered  off  course  and  exploded  40 
seconds  after  lift-off.  Its  self-destruct  system  was  initiated  when  the  rocket 
detected  it  was  disintegrating.  This  damage  was  caused  by  friction  with  the 
atmosphere  as  the  rocket  was  travelling  at  too  shallow  an  angle. 

The  flight  path  of  the  rocket  was  controlled  by  two  software  components, 
one  providing  the  flight  data,  and  the  other  converting  this  data  into  signals 
which  controlled  nozzles  that  direct  the  rocket’s  boosters.  The  problem  was 
found  to  be  with  the  software  providing  the  flight  data,  which  was  imported 
as-is from the earlier Ariane 4 (a similar problem underlying the Therac 25
failure).

The software executed an instruction to convert a 64-bit floating-point
number to a 16-bit signed integer, on a value that was too big to be stored as
a 16-bit integer. (Ariane 5 used a different flight path from Ariane 4 which involved
a  shorter  period  of  vertical  ascent  before  yawing  over  to  accelerate,  thus 
reaching  shallower  angles  than  Ariane  4  sooner  in  the  flight;  this  problem 
thus  never  arose  with  Ariane  4.)  As  there  was  no  code  to  deal  with  this 
exception,  the  program  crashed,  and  the  ensuing  error  messages  generated 
by the system were interpreted by the guidance system as flight data. Ironically,
the part of the software that failed was only needed by Ariane 4
before  lift-off,  and  was  only  active  during  the  first  part  of  the  flight  due  to 
the  possibility  of  a  short  hold  prior  to  lift-off.  This  piece  of  software  was 
unnecessary  for  Ariane  5. 
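
The failing conversion can be sketched as follows. This is an illustrative Python sketch only, not the flight code; the function names and the sample values are assumptions made for the example. It contrasts a checked conversion, which raises an error when the value does not fit in 16 bits, with a saturating conversion, which clamps the value instead.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, raising if the value does not fit."""
    n = int(value)  # truncates towards zero
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return n


def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    print(to_int16_checked(1234.5))      # small value: fits
    print(to_int16_saturating(40000.0))  # large value: clamped to the maximum
    try:
        to_int16_checked(40000.0)        # large value: raises
    except OverflowError as exc:
        # In the scenario described above, no handler of this kind existed.
        print("conversion failed:", exc)
```

Either design is defensible; what matters is that the out-of-range case is specified and handled deliberately rather than left to chance.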

The Ariane 501 Inquiry Board reported that the failure was “due to specification
and design errors in the software of the inertial reference system”
because the Ariane 5 Development Programme “did not include adequate
analysis and testing of the inertial reference system or of the complete flight
control system.” It recommended that the European Space Agency should
in the future ascertain that “specification, verification and testing are of
consistently high quality.” (For details see “Ariane 501 Inquiry Board report,”
http://esamultimedia.esa.int/docs/esa-x-1819eng.pdf.)

0.1.7  Needham-Schroeder  Protocol 

When communicating over the Internet, where anyone can intercept and
read the messages you send, it is important to securely encrypt any sensitive
information that you may send out, such as your credit card details, so that
only the intended recipient of your message can decrypt and read it. The
Needham-Schroeder protocol was devised to allow two parties - commonly
referred  to  as  “Alice”  and  “Bob”  -  to  authenticate  themselves  over  such  an 
insecure  channel:  after  executing  such  a  protocol,  Alice  will  believe  that 
she  is  talking  to  Bob  and  vice  versa,  and  hence  they  will  have  established 
mutual  trust  for  further  transactions. 

The Needham-Schroeder protocol is based on public-key cryptography:
Bob  (and  anyone  else)  can,  for  instance,  use  Alice’s  public  key  -  which 
he  can  obtain  from  some  trusted  server  -  for  encrypting  messages  to  Alice 
which  only  Alice  can  decrypt  and  read  using  her  private  key  which  she  keeps 
secret.  The  protocol  then  works  as  follows: 

1. Alice sends a message to Bob - encrypted with his public key - consisting
of a random number along with some statement about her identity.

2.  Bob  decrypts  this  message  with  his  private  key,  and  sends  a  message 
in  response  to  Alice  -  encrypted  with  her  public  key  -  consisting  of 
Alice’s  random  number  along  with  a  random  number  of  his  own. 

3.  Alice  decrypts  Bob’s  message  with  her  private  key,  and  sends  another 
message  to  Bob  -  again  encrypted  with  his  public  key  -  consisting  of 
Bob’s  random  number. 

When  Alice  receives  Bob’s  response  to  her  first  message,  she  will  believe  that 
she  is  talking  to  Bob,  as  only  Bob  could  have  decrypted  her  message  and 
discovered  the  random  number  that  she  had  sent  him.  Equally,  when  Bob 
receives  Alice’s  second  message,  he  will  believe  that  he  is  talking  to  Alice, 
as  only  she  could  have  decrypted  his  message  and  discovered  the  random 
number  that  he  had  sent  her.  Hence,  after  executing  this  protocol,  Alice 
and  Bob  will  both  have  reason  to  trust  each  other’s  identities. 
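
The three steps above can be traced in a small symbolic model. The Python sketch below is only an illustration under strong assumptions: encryption is modelled as tagging a payload with the name of the key owner, so that only that owner can open it, and the helper names encrypt, decrypt and run_protocol are ours. No real cryptography is involved.

```python
import secrets


def encrypt(recipient, payload):
    """Symbolic public-key encryption: only `recipient` can open the result."""
    return ("enc", recipient, payload)


def decrypt(owner, ciphertext):
    """Open a ciphertext with `owner`'s private key, or fail."""
    tag, recipient, payload = ciphertext
    if tag != "enc" or recipient != owner:
        raise ValueError(f"{owner} cannot decrypt a message meant for {recipient}")
    return payload


def run_protocol():
    # Step 1: Alice sends her random number and identity, encrypted for Bob.
    alice_number = secrets.token_hex(8)
    msg1 = encrypt("Bob", (alice_number, "Alice"))

    # Step 2: Bob decrypts and replies with Alice's number plus his own.
    received_number, claimed_sender = decrypt("Bob", msg1)
    bob_number = secrets.token_hex(8)
    msg2 = encrypt(claimed_sender, (received_number, bob_number))

    # Step 3: Alice checks her number came back, then returns Bob's number.
    echoed_number, number_from_bob = decrypt("Alice", msg2)
    alice_trusts_bob = echoed_number == alice_number
    msg3 = encrypt("Bob", number_from_bob)

    # Bob checks that his own number came back.
    bob_trusts_alice = decrypt("Bob", msg3) == bob_number
    return alice_trusts_bob, bob_trusts_alice


print(run_protocol())
```

In this model each party's belief rests entirely on seeing its own random number returned, which is the reasoning given in the paragraph above.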

This  protocol  was  devised  in  1978,  and  for  over  15  years  it  gave  no  cause 
for  concern  to  the  network  community.  Indeed,  there  were  a  variety  of 
“proofs”  attesting  to  the  correctness  and  reliability  of  this  protocol.  Despite 
this  evidence  of  the  protocol’s  security,  in  1995  it  was  discovered  to  be 
susceptible to a very basic man-in-the-middle attack: an intruder could
participate  in  the  protocol  and  convincingly  impersonate  another  agent  - 
even  without  breaking  the  encryption.  Here  is  how  it  works: 

1.  The  intruder  masquerades  as  Bob  so  that  Alice  encrypts  her  initial 
message  with  the  intruder’s  public  key  and  sends  her  message  to  him. 

2. The intruder decrypts Alice’s message with his private key, then encrypts
it with Bob’s public key and sends this on to Bob.

3.  Bob  sends  Alice’s  random  number  together  with  his  own,  encrypted 
with  Alice’s  public  key,  to  the  intruder,  who  forwards  it  -  unaltered  - 
to  Alice. 

4. Alice decrypts Bob’s message, encrypts Bob’s random number with
the intruder’s public key and sends it to the intruder.

5. The intruder decrypts this message, encrypts it with Bob’s public key
and sends this on to Bob.


As far as Alice and Bob are concerned, the results of this interaction, despite
the intruder’s interference, appear identical to those of the original
interaction, so they will once again believe - this time incorrectly - that
they are talking directly to each other. Their subsequent correspondence
will  all  be  via  the  intruder,  who  will  be  able  to  read  all  of  Alice’s  messages, 
as  they  will  continue  to  be  encrypted  using  the  intruder’s  public  key  and 
re-encrypted  by  the  intruder  with  Bob’s  public  key  before  being  sent  on  to 
Bob.  The  intruder  will  still  not  be  able  to  read  Bob’s  messages,  though, 
as  these  will  all  be  encoded  using  Alice’s  public  key,  and  the  intruder  will 
only  be  able  to  forward  these  unaltered  to  Alice.  (For  details,  see  G  Lowe, 
“An  attack  on  the  Needham-Schroeder  public  key  authentication  protocol,” 
Information  Processing  Letters  56(3),  pages  131-136,  November  1995.) 
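
The five steps of the attack can be replayed in a symbolic model. The Python sketch below is an illustration only: encryption is modelled as tagging a payload with the name of the key owner, the random numbers are fixed strings for readability, and the names encrypt, decrypt and replay_attack are ours rather than part of the published analysis.

```python
def encrypt(recipient, payload):
    """Symbolic public-key encryption: only `recipient` can open the result."""
    return ("enc", recipient, payload)


def decrypt(owner, ciphertext):
    """Open a ciphertext with `owner`'s private key, or fail."""
    tag, recipient, payload = ciphertext
    if tag != "enc" or recipient != owner:
        raise ValueError(f"{owner} cannot decrypt a message meant for {recipient}")
    return payload


def replay_attack():
    alice_number, bob_number = "number-A", "number-B"

    # 1. Alice encrypts her first message with the intruder's public key.
    msg1 = encrypt("Intruder", (alice_number, "Alice"))
    # 2. The intruder decrypts it and re-encrypts it for Bob.
    msg1_forwarded = encrypt("Bob", decrypt("Intruder", msg1))
    # 3. Bob replies, encrypting for Alice; the intruder cannot read the
    #    reply and forwards it unaltered.
    received_number, claimed_sender = decrypt("Bob", msg1_forwarded)
    msg2 = encrypt(claimed_sender, (received_number, bob_number))
    try:
        decrypt("Intruder", msg2)
        intruder_read_reply = True
    except ValueError:
        intruder_read_reply = False
    # 4. Alice decrypts the reply and returns Bob's number to the intruder.
    echoed_number, number_from_bob = decrypt("Alice", msg2)
    alice_accepts = echoed_number == alice_number
    msg3 = encrypt("Intruder", number_from_bob)
    # 5. The intruder learns Bob's number and re-encrypts it for Bob.
    learned_number = decrypt("Intruder", msg3)
    bob_accepts = decrypt("Bob", encrypt("Bob", learned_number)) == bob_number

    return alice_accepts, bob_accepts, intruder_read_reply, learned_number == bob_number


print(replay_attack())
```

The sketch shows that both checks succeed and the intruder ends up knowing Bob's random number, even though the intruder never decrypts a message that was encrypted for someone else.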

In  contrast  to  the  previous  examples,  this  is  a  pure  design  error,  which  is 
again  rather  unexpected  and  surprising  given  the  simplicity  and  stringency 
of  the  protocol.  The  difficulty  with  detecting  this  flaw  is  that  intruders  can 
behave  in  various  unexpected  ways  that  -  being  unpredictable  -  are  very 
difficult  to  analyse.  Even  very  simple  protocols  can  lead  to  a  wide  variety 
of  different  system  behaviours  that  need  to  be  considered.  It  seems  rather 
improbable  that  such  diversity  can  be  catered  for  simply  by  testing. 

The  various  failures  discussed  above  have  complex  and  generally  multiple 
causes,  and  most  of  them  can  be  traced  back  to  poor  software  development 
processes. What is lacking in the development process is a rigorous engineering
discipline through which a thorough understanding of the system
being  developed  is  obtained  before  the  system  is  constructed.  In  traditional 
engineering  disciplines,  the  methods  for  obtaining  such  an  understanding  are 
well  established  and  based  on  formally  modelling  an  appropriately-abstract 
version of the system being developed. The challenge for Software Engineering
is to mimic these methods; to do so requires an understanding of how
to  describe  and  analyse  abstract  models  of  software  systems.  Of  course,  this 
first  requires  an  understanding  of  these  terms. 

The notions of “system,” “model,” “abstraction” and “notation” are essential
to this book. In this section we provide various dictionary-style definitions
of these concepts, interspersed with some examples and thoughts.

System

An  assemblage  of  objects  arranged  in  a  regular  subordination,  or  after 
some  distinct  method,  usually  logical  or  scientific;  a  complete  whole 
of  objects  related  by  some  common  law,  principle,  or  end;  a  complete 
exhibition of essential principles or facts, arranged in a rational dependence
or connection; a regular union of principles or parts forming one
entire  thing. 

How do we understand systems and put them together? In the object-oriented
approach to software design, one is guided by the above dictionary
definition  and  methodically  describes  the  whole  by  giving  descriptions  of  the 
constituent  parts  along  with  how  these  parts  are  put  together.  If  you  have 
tried  and  trusted  building  blocks,  then  you  can  reliably  use  them  again. 

To understand and analyse the world in terms of systems is very important
to science; and to build them is the fundamental task of Engineers.
Systems  can  be  described,  for  instance,  in  terms  of  their  structure  -  how 
they  can  be  decomposed  into  parts  and  how  these  parts  are  related  to  each 
other;  or  in  terms  of  their  behaviour  -  how  they  evolve  and  interact  with 
their  environment;  or  in  terms  of  their  functionality  -  what  their  goals 
and  objectives  are.  Systems  are  often  contrasted  with  the  environments  in 
which  they  are  embedded  and  with  which  they  interact.  A  prime  example 
from  the  world  of  computers  is  the  operating  system,  which  manages  our 
interactions  with  the  computer  hardware. 

This  book  addresses  computing  systems.  However,  we  understand  the 
term  “computing”  in  a  rather  loose  sense.  We  do  not  identify  computing 
systems  with  computers,  but  with  all  kinds  of  systems  that  access,  store, 
process and communicate information. Many biological, physical, economic
and social systems have recently been studied from this point of view,
and  many  of  the  concepts  and  techniques  introduced  in  this  book  can  be 
used  in  these  contexts. 

Model 

(1)  A  miniature  representation  of  a  thing,  with  the  several  parts  in 
due proportion; sometimes, a facsimile of the same size. (2) Something
intended to serve, or that may serve, as a pattern of something
to  be  made;  a  material  representation  of  or  embodiment  of  an  ideal; 
sometimes  a  drawing  or  a  plan;  a  description  of  observed  behaviour, 
simplified  by  ignoring  certain  details. 

Building  models  is  at  the  core  of  any  scientific  and  engineering  discipline. 
Scientists  need  models  to  interpret  their  data  and  make  predictions;  and 
traditional  engineering  products  such  as  bridges  and  aeroplanes  are  never 
built  until  models  of  them  have  been  developed  and  studied  to  understand 
the characteristics of the product. These models may be small-scale versions
of the product which are tested in wind tunnels; or they may be purely abstract
models described on paper using some formal notation which are then
analysed  more  formally,  for  instance  through  simulations  on  a  computer. 

Modelling  techniques  are  also  becoming  more  and  more  important  in 
software  engineering,  as  computing  systems  become  ever  more  complex  and 
ubiquitous,  and  their  proper  functioning  is  often  extremely  critical.  It  is 
no  longer  possible  to  rely  on  the  cleverness  of  our  programming  skills  when 
building  computing  systems.  In  this  book  we  shall  explore  basic  modelling 
techniques  for  software  engineering;  consider  various  simple  illustrative  yet 
sufficiently  interesting  computing  systems;  and  describe  models  that  capture 
those  aspects  of  their  behaviour  that  interest  us. 

Models  come  in  all  shapes  and  sizes,  and  are  designed  to  capture  specific 
aspects  of  the  thing  they  represent.  Consider,  for  example,  the  following 
two  uses  of  simple  railway  models. 

• If we are interested in teaching the history of the development of railway
locomotives, then full scale working replicas would be fun, but
probably  inconvenient;  small  scale  working  replicas  might  do,  or  even 
non-working  replicas.  Meaning  (1)  is  appropriate. 

•  If  we  are  interested  in  developing  strategies  for  safe  shunting,  then  a 
child’s  train  set  might  do.  But  we  could  also  make  do  with  a  paper 
and  pencil  model  with  a  sketch  of  the  track  and  buttons  representing 
the  engines  and  rolling  stock;  or  a  computer  model  with  a  graphical 
interface  and  a  simulator  might  even  be  more  useful.  Meaning  (2)  is 
appropriate. 

Note that models allow complex systems to be understood, and their behaviour
predicted, only within the scope of the model; they may give incorrect
descriptions and predictions for situations outside the scope of their
intended use. For example, a toy train set would not be much use if we
were  interested  in  the  stresses  and  strains  induced  in  real  rolling  stock  when 
shunting.  Building  good  models  not  only  requires  formal  training,  but  also 
a  lot  of  experience,  and  a  critical  mind. 

Abstraction 

The act or process of leaving out of consideration one or more properties
of a complex object so as to attend to others.

Abstraction is an important part of model building: identifying those features
that are essential for inclusion in the model and separating out those
features  that  can  be  neglected  since  the  essential  elements  do  not  rely  on 
their  presence. 

As an example of an abstraction, OS (Ordnance Survey) maps are used
by  walkers  in  Britain  who  usually  want  to  know  where  they  are,  where  they 
are  going  (how  far,  which  direction),  and  how  long  it  will  take.  OS  maps 
are  to  scale  (typically  4  cm  to  1  km),  and  include  (easting,  northing)  grid 
reference  pairs  allowing  the  user  to  pinpoint  locations  very  accurately.  For 
example,  Perriswood  near  Reynoldston  has  the  grid  reference  (502,  888)  on 
the  Gower  map  (OS  Explorer  Map  164).  By  eye  or  by  laying  a  piece  of 
string  along  a  proposed  route,  experienced  walkers  can  estimate  its  length 
fairly  accurately,  and  then  use  a  simple  formula  such  as  Naismith’s  Rule 
of  5  kilometres  per  hour  plus  10  minutes  for  each  100  metres  of  uphill  to 
estimate  their  walking  time.  To  be  useful  to  walkers,  OS  maps  are  portable 
(they  fold  flat),  are  to  scale,  and  use  contours  and  shading  to  show  heights 
and  indicate  steepness. 
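
Naismith’s Rule as stated above is itself a small model, and it translates directly into a formula. The Python sketch below is a minimal rendering of that rule; the function name naismith_minutes and the sample route are assumptions made for illustration.

```python
def naismith_minutes(distance_km: float, ascent_m: float) -> float:
    """Estimate walking time in minutes using Naismith's Rule.

    5 kilometres per hour on the flat, plus 10 minutes for each
    100 metres of uphill.
    """
    walking = distance_km / 5.0 * 60.0   # minutes spent covering the distance
    climbing = ascent_m / 100.0 * 10.0   # extra minutes for the ascent
    return walking + climbing


# A hypothetical 10 km route with 300 m of ascent.
print(naismith_minutes(10, 300))
```

Like the map it is applied to, the rule ignores many details (terrain, weather, fitness) and is only reliable within the scope for which it was intended.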

The London Underground train map and A to Z street atlas have different
formats from OS maps, and from each other, as they serve different
purposes.  The  first  A  to  Z  street  atlas  was  designed  in  1936  by  Phyllis 
Pearsall,  a  portrait  artist,  due  to  her  frustration  at  getting  lost  during  her 
walks  through  London  while  trying  to  follow  an  OS  map.  The  Underground 
train  map,  on  the  other  hand,  would  be  of  very  little  use  to  a  walker.  It  was 
originally designed in 1931 by Harry Beck, a draughtsman educated in electronics,
and is reminiscent of an electronic circuit board diagram with only
vertical, horizontal and 45-degree lines. Train stations are not depicted
geographically accurately; the connections between stations are accurate, but
the stations and routes of the trains are distorted to provide an
aesthetically-pleasing image. As such, it provides an ideal model for using the Tube, when
you  don’t  need  to  know  where  you  are  geographically  as  you  would  if  you 
were  walking,  but  rather  are  only  interested  in  where  to  get  on,  where  to 
change  lines,  and  where  to  get  off.  By  distorting  the  geography,  in  particular 
by  pulling  in  very  remote  stations  located  at  the  ends  of  lines,  a  balanced 
and  concise  diagram  results  which  is  easy  to  use  and  pleasant  to  look  at. 

Notation 

Any particular system of characters, symbols, or abbreviated expressions
used in art, or in science, to express briefly technical facts, quantities,
etc. Especially the system of figures, letters, and signs used in
arithmetic  and  algebra  to  express  number,  quantity,  or  operations. 

Notation is one of the most undervalued ideas in computer science. It is
prevalent  in  the  form  of  programming  languages,  but  typically  ignored  at 
any  higher  level.  A  good  notation  provides  the  shortest  distance  between 
the  idea  in  your  head  and  a  piece  of  paper. 

Florian  Cajori’s  two-volume  masterpiece  A  History  of  Mathematical 
Notations  (1928-1929)  points  out  that  scientific  progress  was  sometimes 
held back for years, decades, or even centuries because there wasn’t the
right notation around in which to express the relevant ideas. Compare Roman
and Arabic numerals for addition, subtraction, etc. Consider also zero,
the decimal point, complex numbers, the calculus. Imagine expressing sameness
in quantity before Robert Recorde’s invention of the equality sign. The
effect that notation has on facilitating problem-solving is aptly summarised
by Alfred North Whitehead as follows: “By relieving the mind of all unnecessary
work, a good notation sets it free to concentrate upon more
advanced problems”.


0.3 Specification, Implementation and Verification

The concepts and methods of computational modelling and thinking are relevant
to many different fields, but their foremost domain is the development
of  high  quality  and  dependable  software.  To  set  the  scene,  we  briefly  discuss 
three  tasks  that  are  central  to  software  development  and  its  formalisation 
through  computational  modelling. 

1.  Specification  refers  to  the  task  of  modelling  a  computing  system 
together  with  its  functionality  and  behaviour.  This  can  be  understood 
as  a  formal  description  of  a  problem  to  be  solved. 

2.  Implementation  refers  to  the  task  of  programming  the  specification 
so  that  it  can  be  executed  on  a  computer.  This  can  be  understood  as 
an  effective  solution  to  the  problem  posed  by  the  specification. 

3.  Verification  refers  to  the  task  of  rigorously  demonstrating  that  the 
implementation  does  indeed  respect  the  specification.  This  can  be 
understood  as  a  proof  that  the  implementation  does  indeed  solve  the 
problem  specified. 
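
A toy example may help to separate the three tasks. In the Python sketch below, which is our own illustration and not an example from this book, the specification is a predicate describing the integer square root, the implementation is a binary search, and the final check compares the two exhaustively on a bounded range of inputs. All names are hypothetical. Note that such a bounded check demonstrates correctness only for the inputs tried; verification in the sense above would require a proof covering all inputs.

```python
def meets_spec(n: int, r: int) -> bool:
    """Specification: r is the integer square root of n."""
    return r >= 0 and r * r <= n < (r + 1) * (r + 1)


def integer_sqrt(n: int) -> int:
    """Implementation: binary search for the integer square root of n >= 0."""
    lo, hi = 0, n + 1  # invariant: lo*lo <= n < hi*hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo


def check_up_to(limit: int) -> bool:
    """Bounded check: does the implementation meet the spec for 0 <= n < limit?"""
    return all(meets_spec(n, integer_sqrt(n)) for n in range(limit))


print(check_up_to(2000))
```

The loop invariant recorded in the comment is the kind of fact a genuine proof would rely on: it holds initially, is preserved by each iteration, and implies the specification when the loop ends.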

The  development  of  mathematical  methods  that  formalise  these  three 
tasks is sometimes considered to be the Holy Grail of Software Engineering.
In an ideal world, such methods could make software testing obsolete
and software bugs history. But after four decades of research on such methods,
this still remains an ideal, and there are mathematical results about
decision  problems,  program  termination  and  incompleteness  of  theories  that 
suggest  that  this  may  be  necessarily  so. 

However,  while  nobody  would  expect  mathematical  formalisation  to  solve 
all  problems  of  science  or  engineering,  mathematical  methods  and  tools  have 
significantly  contributed  to  the  success  of  these  disciplines.  The  situation 
is similar in computing: many light-weight mathematical methods for modelling
computing systems have already made their way into industrial applications,
from programming languages to design and analysis tools for the
specification, implementation and verification of software systems. By developing
and adopting such industrial-strength methods to avoid errors, the
task  of  searching  for  and  repairing  software  bugs  will  hopefully  become  more 
and  more  unnecessary  -  or  at  least  simpler  and  more  routine  -  thus  making 
“Software’s  Chronic  Crisis”  of  system  failures  something  of  mere  historical 
interest. 


Part  I 


Mathematics  for 
Computer  Science 


Chapter  1 

Propositional  Logic 


Either  this  man  is  dead  or  my  watch  has  stopped. 

-  Groucho  Marx. 

Like  her  three  older  brothers  before  her,  little  Amanda  always  wants  to 
know  “Why”:  “Why  do  I  have  to  go  to  school?”  “Why  does  it  only  snow 
in  Winter?”  As  young  as  she  is,  she  can  understand  that  -  logically  -  the 
responses  she  gets  satisfy  each  and  every  one  of  her  queries:  “You  go  to 
school  to  learn  things.”  “It  only  snows  in  Winter  because  that’s  the  only 
time  it  gets  cold.”  However,  these  answers  rarely  satisfy  her  -  they  merely 
open  the  way  for  yet  more  queries  to  explain  the  reasons  she  gets  as  answers 
to  her  previous  questions:  “Why  do  I  have  to  learn  things?”  “Why  does 
it  only  get  cold  in  the  Wintertime?”  Her  impatient  father  rarely  wins  this 
game;  it  inevitably  ends  either  with  a  definitive  “Just  because!”  or,  more 
usually,  with  a  simple  “Gee,  I  don’t  know,  that’s  a  very  good  question!  Go 
ask  your  mother.” 

This behaviour demonstrates more than mere curiosity; and in fact curiosity
typically has little to do with it. It is the fun of the game of logical
reasoning  which  motivates  her:  the  pursuit  of  the  absolute,  unquestionable 
premises  from  which  all  the  other  points  follow.  Her  father’s  goal  in  this 
game,  of  course,  is  to  identify  these  premises  as  quickly  as  possible.  (Her  true 
goal,  one  can’t  help  but  feel,  is  to  get  her  father  to  give  up  in  exasperation.) 

It  is  in  our  nature  as  human  beings  to  reason  about  the  world  and  our 
existence,  to  assimilate  the  knowledge  which  we  accumulate  and  to  make 
logical  deductions  based  on  this  knowledge.  Despite  the  fact  that  we  are 
born  with  a  built-in  propensity  to  apply  logical  rules  to  make  deductions 
from  our  knowledge  -  if  we  do  something  potentially  dangerous  such  as  step 
out  into  the  street  without  looking  for  cars,  then  we  may  get  hurt,  and 
therefore  we  shouldn’t  do  such  things  -  it  is  nonetheless  the  case  that  we 
are very bad at doing this correctly and consistently. The problem lies to a great
extent  with  the  ambiguities  in  our  language. 

In  this  chapter  we  shall  see  how  logically  correct  reasoning  manifests 
itself  in  a  multitude  of  ways,  and  we  shall  learn  how  to  tame  our  use  of 
language in order to prevent the types of ambiguities and mismatches which
lead to the sorts of invalid logical arguments which all too typically underlie
system failures. We will see that precise rules of logical reasoning can be
written  down  and  mechanically  applied  like  the  rules  of  chess.  But,  due  to 
their  universality  as  laws  of  thought,  they  are  much  more  than  a  mere  formal 
game.  They  can  be  applied  to  model  and  reason  about  a  huge  variety  of 
systems  and  situations.  In  particular,  they  can  be  very  useful  in  detecting 
unexpected  misbehaviour  or  inconsistency  of  computing  systems. 

Logic  in  fact  lies  at  the  very  core  of  computing.  Historically,  the  concepts 
of  computation  and  effective  computability  have  been  developed  from  a 
logical  basis  and  they  were  motivated  by  questions  about  the  mathematical 
foundations  of  logic.  All  computer  programming  languages  rely  on  logical 
notions  in  their  specifications,  their  implementations  and  their  constructs. 
Logics  are  also  among  the  most  popular  and  effective  methods  for  specifying 
and  analysing  computational  systems  in  formally  rigorous  ways.  And,  last 
but  not  least,  the  design  and  implementation  of  digital  systems  is  strongly 
based  on  logic. 


1.1 Propositions and Deductions

Consider  the  following  argument. 

1.  Either  this  man  is  dead  or  my  watch  has  stopped. 

2.  My  watch  is  still  ticking. 

Therefore 

3.  This  man  is  dead. 

This  is  an  example  of  the  sort  of  reasoning  which  we  (mostly  unconsciously) 
perform  constantly  all  day  long.  If  we  analyse  the  structure  of  the  argument, 
we  see  the  following  elements. 

A.  The  argument  involves  three  statements,  or  propositions,  by  which 
we  mean  declarations  which  are  either  true  or  false  (but  not  both). 
Each  of  the  statements  in  the  argument  is  declared  to  be  true. 

B. The first statement expresses an option between two simpler statements,
namely

1a. This man is dead.
or

1b. My watch has stopped.

C.  A  deduction  or  inference  is  made  to  infer  the  truth  of  the  third 
statement from the truth of the first two statements. The third statement
is referred to as the conclusion of the argument, while the first
two statements from which we draw the conclusion are referred to as
the premises of the argument.

Such  arguments  can  be  formalised  in  propositional  logic.  The  syntax 
(structure)  of  propositional  logic  provides  a  language  for  modelling  systems, 
situations  and  arguments.  The  semantics  (meaning)  of  propositional  logic 
gives  an  interpretation  to  the  symbols  of  the  language.  The  language  of 
propositional  logic  starts  with  atomic  propositions,  such  as  “This  man 
is  dead”,  and  builds  up  larger  compound  propositions  using  a  variety  of 
propositional  connectives,  such  as  “or”.  Each  connective  is  given  a 
precise  prescribed  meaning  which  aims  to  reflect  its  everyday  use  in  natural 
language. The purpose of this formalisation is to remove ambiguities which
are  prevalent  in  the  use  of  English  or  any  other  natural  language. 
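
To preview how such a formalisation can be checked mechanically, the opening argument can be tested by brute force. The Python sketch below is an informal illustration and anticipates ideas developed later in the chapter; it reads “or” as inclusive, treats “my watch is still ticking” as the negation of “my watch has stopped,” and uses the hypothetical helper name is_valid. An argument is taken to be valid when the conclusion is true in every assignment of truth values that makes all the premises true.

```python
from itertools import product


def is_valid(premises, conclusion, num_atoms):
    """True if the conclusion holds whenever all the premises hold."""
    for values in product([True, False], repeat=num_atoms):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # found a counterexample
    return True


# Atomic propositions: dead = "This man is dead",
#                      stopped = "My watch has stopped".
premises = [
    lambda dead, stopped: dead or stopped,  # 1. either ... or ...
    lambda dead, stopped: not stopped,      # 2. my watch is still ticking
]
conclusion = lambda dead, stopped: dead     # 3. this man is dead

groucho_valid = is_valid(premises, conclusion, 2)
print(groucho_valid)  # prints True
```

Dropping the second premise makes the same check fail, since the man may be alive while the watch has stopped.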


The following rules, adapted from those specified by the World Chess Federation
FIDE, describe the conditions for castling. Castling is a move of
the  king  and  either  rook  of  the  same  colour,  counting  as  a  single  move  of  the 
king,  and  executed  as  follows:  the  king  is  transferred  from  its  original  square 
two  squares  towards  the  rook  in  question,  and  then  that  rook  is  transferred 
to  the  square  which  the  king  has  just  crossed. 

1.  The  right  for  castling  with  a  particular  rook  has  been  lost: 

(a)  if  the  king  has  already  moved;  or 

(b)  if  the  rook  in  question  has  already  moved. 

2.  Castling  with  a  particular  rook  is  prevented: 

(a)  if  the  right  for  castling  with  that  rook  has  been  lost;  or 

(b)  if  there  is  a  piece  between  the  king  and  the  rook  in  question;  or 

(c)  if  the  square  on  which  the  king  stands,  or  the  square  which  it 
must  cross,  or  the  square  which  it  is  to  occupy,  is  under  attack 
by  one  or  more  of  the  opponent’s  pieces. 

The  conditions  that  permanently  or  temporarily  prevent  castling  use  the 
propositional connectives “or” and “if” to express constraints under which
castling  is  prohibited. 
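
The castling conditions above map directly onto Boolean expressions. The Python sketch below is one possible rendering; the function and parameter names are hypothetical, and it assumes that the listed conditions are the only ones that prevent castling, which is stronger than what the word “if” in the rules strictly states.

```python
def castling_right_lost(king_has_moved: bool, rook_has_moved: bool) -> bool:
    """Rule 1: the right is lost if the king or the rook in question has moved."""
    return king_has_moved or rook_has_moved


def castling_prevented(
    king_has_moved: bool,
    rook_has_moved: bool,
    piece_between: bool,
    king_square_attacked: bool,
    crossed_square_attacked: bool,
    target_square_attacked: bool,
) -> bool:
    """Rule 2: castling is prevented if any of conditions (a), (b), (c) holds."""
    return (
        castling_right_lost(king_has_moved, rook_has_moved)  # (a)
        or piece_between                                     # (b)
        or king_square_attacked                              # (c)
        or crossed_square_attacked
        or target_square_attacked
    )


# King and rook unmoved, path clear, no relevant square attacked.
print(castling_prevented(False, False, False, False, False, False))
```

Each “or” in the rules becomes an `or` in the code, which is exactly the kind of correspondence that a precise meaning for the connectives is intended to support.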

Arguments  are  all  about  truth.  Therefore,  not  all  sentences  can  take  part 
in  arguments,  simply  because  not  all  sentences  express  statements  which  can 
be  true  or  false.  This  is  the  case  with  questions  like  “Is  that  man  dead?” 
and requests like “Bring me a watch that works.” To be true or false,
a  sentence  must  state  a  potential  fact,  hence  be  related  to  a  potential  bit 
of  reality.  This  criterion  distinguishes  statements  or  propositions  from  all 
other  kinds  of  sentences. 

Exercise 1.1 (Solution on page 405)

Which  of  the  following  are  statements  (propositions)? 

1.  2+3=5. 

2.  2+3=6 

3.  Do  your  homework,  Joel! 

4.  Joel  didn’t  do  his  homework. 

5.  Is  there  life  on  Mars? 

6.  False 

7.  What  Felix  says  is  false. 

8.  What  this  sentence  says  is  false. 


Each  atomic  statement  can,  of  course,  be  further  analysed  with  respect  to 
its  grammatical  structure  -  Joel,  for  instance,  is  a  subject  noun,  do  a  verb, 
and  homework  an  object  noun  -  but  this  is  of  no  relevance  to  propositional 
logic. It is concerned solely with the distinction between logical and non-logical
components and, correspondingly, with the way in which the truth
of  simpler  statements  determines  that  of  more  complex  ones. 

Exercise 1.2 (Solution on page 405)

Which  of  the  following  are  valid  deductions? 

1.  If  the  fire  alarm  sounds,  then  everyone  must  leave  the  building. 

Everyone  is  leaving  the  building. 

Therefore  the  fire  alarm  has  sounded. 

2.  If  the  fire  alarm  sounds,  then  everyone  must  leave  the  building. 

The  fire  alarm  has  sounded. 

Therefore  everyone  is  leaving  the  building. 

3.  If  the  signal  is  green,  then  the  train  may  proceed. 

The  signal  is  red. 

Therefore  the  train  must  wait. 

4. The right for castling with a particular rook has been lost if the king
has already moved.

Both  rooks  have  already  moved. 

Therefore  the  right  for  castling  with  a  particular  rook  has  been  lost. 

5. The right for castling with a particular rook has been lost if the king
has already moved, or if the rook in question has already moved.

One  of  the  two  rooks  has  already  moved. 

Therefore  the  right  for  castling  with  a  particular  rook  has  been  lost. 

6. It is unlawful for any person to keep more than three dogs and three
cats  on  their  property  within  the  city. 

Charles  keeps  five  dogs  (but  no  cats)  on  his  property  in  the  city. 
Therefore  Charles  is  breaking  the  law. 


Exercise 1.3 (Solution on page 406)

Which  of  the  following  are  valid  deductions? 

1.  Epimenides  is  a  Cretan. 

All  Cretans  are  liars. 

Therefore  Epimenides  is  a  liar. 

2.  Epimenides  is  a  Cretan. 

Epimenides  said  that  “All  Cretans  are  liars.” 
Therefore  Epimenides  is  a  liar. 

3.  Epimenides  is  a  Cretan. 

Epimenides  said  that  “All  Cretans  are  liars.” 
Therefore  all  Cretans  are  liars. 

4.  Epimenides  is  a  Cretan. 

Epimenides  said  that  “All  Cretans  are  liars.” 
Therefore  not  all  Cretans  are  liars. 

5.  Epimenides  is  a  Cretan. 

Aristotle  said  that  “All  Cretans  are  liars.” 
Therefore  Epimenides  is  a  liar. 


1.2 The Language of Propositional Logic

The  syntax  of  propositional  logic  is  the  formal  definition  of  the  language, 
the  object  language  of  formal  logic.  This  definition  is  given  in  a  meta¬ 
language  -  natural  language  in  this  case  -  in  which  we  speak  about  the 
language  of  propositional  logic.  The  metalanguage  itself  will  use  logical  no¬ 
tions  and  reasoning,  albeit  at  an  informal  level;  since  the  levels  can  be  kept 
separate,  there  should  be  no  conceptual  confusion. 

The definition of syntax has two steps. In the first step, the basic symbols
of the language are defined. In the second step, the rules for writing
formulae with these symbols are defined; these formulae represent statements or
propositions. The precise definition of a formula will be given at the end of
this section; we first introduce the components of this definition informally.

1.2.1 Propositional Variables

In  propositional  logic,  the  meaning  of  a  particular  atomic  proposition  is  given 
solely  by  its  truth  or  falsity.  We  therefore  abstract  from  these  propositions 
and  introduce  propositional  variables  instead. 

In algebra, variables such as x, y and z are used to represent unknown
numbers. The occurrences of the variable x in the quadratic equation
x² + 2x − 15 = 0 are place holders for some value, in this case a number. The
equation restricts the admissible values of x to being either 3 or −5. That
is, if 3 is substituted for every occurrence of x in the equation, or if −5
is substituted for every occurrence of x in the equation, then the equation
holds; and when any other number is substituted, it doesn’t.

We  use  variables  in  a  similar  way  in  propositional  logic.  Propositional 
variables  such  as  P,Q,R, . . .  represent  unknown  propositions.  In  algebra 
we  may  assign  a  specific  value  to  a  variable;  for  example,  we  might  write 
“let  x=3”  and  then  interpret  every  subsequent  occurrence  of  the  variable  x 
by  the  value  3.  Similarly  we  may  let  a  propositional  variable  represent  a 
specific  proposition,  for  example  writing  “let  Dead  represent  the  statement: 
This  man  is  dead.”  (Following  good  programming  style,  we  will  typically 
use  meaningful  words  as  propositional  variables  rather  than  mere  letters  to 
obtain  more  readable  statements.) 

In algebra, values (including unknown values represented by variables)
can be combined using various operations, such as addition (+), subtraction
(−), multiplication (×) and division (÷). In propositional logic, we may
combine propositions using various propositional connectives, specifically
“not” (¬), “or” (∨), “and” (∧), “if . . . then . . .” (⇒), and “. . . if, and
only if, . . .” (⇔). An informal description of the connectives of propositional
logic is given in the following sections.

1.2.2  Negation 

The negation ¬p of a statement p, pronounced “not p”, is a statement
which  is  true  if,  and  only  if,  p  is  false.  This  is  typically  expressed  in  English 
in  one  of  the  following  ways: 

• not p (more precisely, the statement p with “not” modifying the
verb, typically by appearing immediately after it);

•  p  does  not  hold  /  is  not  true  /  is  false; 

•  it  is  not  the  case  that  p. 


If Dead stands for the statement “This man is dead,” then ¬Dead says “It
is not the case that this man is dead,” or, equivalently, “This man is not
dead.”


If a proposition is not true, then it must be false; and conversely, if it is
not false, then it must be true. In particular then, if a proposition is not
not true, then it is true: ¬¬p is the same as p. This is referred to as the
Law of Double Negation.

Exercise 1.4 (Solution on page 406)

Rewrite  the  following  statements  without  negations  at  the  start. 

1. ¬ “The Earth revolves around the sun.”

2. ¬ “All of my children are boys.”

3. ¬(2 + 2 < 4).


1.2.3  Disjunction 

The disjunction p ∨ q of two statements p and q, pronounced “p or q”, is
a  statement  which  is  true  if,  and  only  if,  p  is  true  or  q  is  true  (or  indeed 
if  both  are  true);  that  is,  at  least  one  of  p  and  q  is  true.  This  is  typically 
expressed  in  English  in  one  of  the  following  ways: 


•  p  or  q; 

•  p  or  q  or  both; 

•  p  and/or  q; 

•  p  unless  q. 

In the context of the disjunction p ∨ q, the propositions p and q are individually
referred to as disjuncts.


Example 1.4

If Dead stands for the statement “This man is dead” and Watch stands for
the statement “My watch has stopped,” then Dead ∨ Watch says “Either
this man is dead or my watch has stopped,” or, equivalently, “If this man
is alive, then my watch must have stopped.” This does not preclude the
possibility that the man is dead and my watch has stopped, in which case
Dead ∨ Watch will still be true.


Example 


In  chess,  the  right  for  castling  with  a  particular  rook  has  been  lost  if  the 
king  has  already  moved,  or  if  the  rook  in  question  has  already  moved.  This 
condition can be formalised as KingMoved ∨ RookMoved, where KingMoved
and RookMoved are propositional variables standing for the statements “The
king has moved” and “The rook has moved,” respectively. In particular,
therefore, one may not castle with a particular rook if both the king and
the rook in question have already been moved.


Recalling that ¬p is true if p is not true, we can note that p ∨ ¬p must
always be true regardless of what proposition p stands for: either p is true, or
it is not true. This fact is referred to as the Law of the Excluded Middle:
there  is  no  middle  ground  when  it  comes  to  the  truth  of  a  propositional 
formula. 
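
The law can be confirmed by brute force, since p has only two possible truth
values. The following Python sketch is an illustration only: the function name
excluded_middle is our own, and Python's `or` and `not` stand in for ∨ and ¬.

```python
def excluded_middle(p: bool) -> bool:
    """Return the truth value of p ∨ ¬p for the given truth value of p."""
    return p or (not p)


# A proposition is either true or false, so these two cases are exhaustive.
results = [excluded_middle(p) for p in (True, False)]
print(results)
```

Both entries of `results` are true, which is what it means for p ∨ ¬p to hold
regardless of what p stands for.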

Exercise 1.5 (Solution on page 406)

Are  the  following  disjunctions  true  or  false? 

1. (3 < 2) ∨ (3 < 5)

2. (5 < 4) ∨ (7 < 5)

3. (5 < 6) ∨ (6 < 8)


Note that p ∨ q is true if (though not only if) both p and q are true.
In  propositional  logic,  there  can  be  no  ambiguity:  the  “or”  is  always  taken 
in  this  inclusive  sense.  In  some  everyday  circumstances,  however,  “or”  is 
used  in  the  exclusive  sense:  the  statement  “Either  you  be  quiet  now  or 
you  won't  get  an  ice  cream!”  certainly  is  not  supposed  to  be  true  in  the 
case  in  which  the  child  under  consideration  is  quiet  but  still  doesn’t  get 
the  ice  cream  -  that  would  be  an  unfair  trick.  Such  an  “exclusive  or”  is 
in  fact  provided  by  a  different  connective  from  the  (inclusive)  “or”  used 
in propositional logic; it is written ⊕, and it has its own different truth
conditions: p ⊕ q is true if, and only if, one of p and q is true and the other
is  false;  that  is,  precisely  one  of  p  and  q  is  true.  Note  that  this  connective  is 
not  formally  a  part  of  the  definition  of  propositional  logic;  however,  it  can 
be  expressed  using  the  connectives  of  propositional  logic  (see  Example  1.10 
on  page  29). 
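
The difference between the two readings of “or” shows up in exactly one row of
their truth tables. A small Python sketch makes this visible; the function
names inclusive_or and exclusive_or are illustrative choices of ours, not part
of propositional logic.

```python
from itertools import product


def inclusive_or(p: bool, q: bool) -> bool:
    # p ∨ q: at least one of p and q is true
    return p or q


def exclusive_or(p: bool, q: bool) -> bool:
    # p ⊕ q: precisely one of p and q is true
    return p != q


# One row per combination of truth values: (p, q, p ∨ q, p ⊕ q)
rows = [(p, q, inclusive_or(p, q), exclusive_or(p, q))
        for p, q in product((True, False), repeat=2)]
for row in rows:
    print(row)

# The combinations on which the two connectives disagree
differs = [(p, q) for p, q, incl, excl in rows if incl != excl]
print(differs)
```

The two connectives disagree only when both p and q are true, which is the
case of the quiet child who still gets no ice cream.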

Exercise 1.6 (Solution on page 406)

For  each  of  the  following  disjunctive  statements,  decide  whether  you  think 
the  speaker  intends  to  use  the  inclusive  or  exclusive  sense  of  the  disjunction. 

1. Joel came in last place in the round-robin competition; so that means
that either Felix beat him or Oskar beat him.

2.  The  light  is  either  on  or  off. 

3.  You  can  have  tea  or  coffee. 

1.2.4 Conjunction

The conjunction p ∧ q of two statements p and q, pronounced “p and q”,
is  a  statement  which  is  true  if,  and  only  if,  both  p  and  q  are  true.  This  is 
typically  expressed  in  English  in  one  of  the  following  ways: 

•  p  and  q; 

•  p  but  q; 

•  not  only  p  but  also  q. 

In the context of the conjunction p ∧ q, the propositions p and q are individually
referred to as conjuncts.


Example 


If  Dead  stands  for  the  statement  “This  man  is  dead”  and  Watch  stands  for 
the statement “My watch has stopped,” then Dead ∧ Watch says “This man
is  dead  and  my  watch  has  stopped,”  or,  equivalently,  “Not  only  is  this  man 
dead,  but  so  is  my  watch!” 


Recalling that ¬p is false if p is true, we can note that p ∧ ¬p must always
be false regardless of what proposition p stands for: p and ¬p cannot both
be true at the same time.

Exercise 1.7 (Solution on page 407)

Are  the  following  conjunctions  true  or  false? 

1. (3 < 2) ∧ (3 < 5)

2. (5 < 4) ∧ (7 < 5)

3. (5 < 6) ∧ (6 < 8)


1.2.5  Implication 

Given two statements p and q, the implication p ⇒ q, pronounced “p
implies q”, is a statement which is true if, and only if, p is false, or q is true;
that is, if p is true then q must also be true. In other words, p ⇒ q is false
if, and only if, p is true and q is false. This is typically expressed in English
in  one  of  the  following  ways: 

•  p  implies  q; 

• if p then q;

• q if p;

•  p  only  if  q; 

• q whenever p;

•  p  is  a  sufficient  condition  for  q; 

•  q  is  a  necessary  condition  for  p. 


In the context of the implication p ⇒ q, p is referred to as the premise and
q  is  referred  to  as  the  conclusion. 


Let  the  variable  SignalDanger  stand  for  the  statement  “The  signal  shows 
danger,”  and  let  the  variable  TrainStop  stand  for  the  statement  “The  train 
stops.” Then SignalDanger ⇒ TrainStop stands for the statement “If the
signal  shows  danger  then  the  train  stops.” 

The  only  event  in  which  this  statement  can  be  false  is  when  the  signal 
shows  danger  and  yet  the  train  does  not  stop.  Hence  the  rule  allows  the  case 
that  the  signal  does  not  show  danger  and  yet  the  train  nevertheless  stops. 
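
The truth conditions of ⇒ can be tabulated directly from the definition “p is
false, or q is true”. In the following Python sketch the helper implies is our
own name for the connective, and the two Boolean values play the roles of
SignalDanger and TrainStop.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    # p ⇒ q is true if, and only if, p is false or q is true
    return (not p) or q


# Search all four situations for those that falsify SignalDanger ⇒ TrainStop.
falsifying = [(signal_danger, train_stop)
              for signal_danger, train_stop in product((True, False), repeat=2)
              if not implies(signal_danger, train_stop)]
print(falsifying)

# The signal does not show danger, yet the train stops: the rule is respected.
print(implies(False, True))
```

The search finds a single falsifying situation, namely danger shown and the
train not stopping; the other three situations all satisfy the rule.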


Exercise 1.8 (Solution on page 407)


Letting  JoelHappy  stand  for  “Joel  is  happy”  and  AmandaHappy  stand  for 
“Amanda  is  happy,”  each  of  the  following  statements  translates  as  either 
JoelHappy ⇒ AmandaHappy or as AmandaHappy ⇒ JoelHappy. Determine
which  in  each  case. 


1.  “Joel  is  happy  whenever  Amanda  is  happy.” 

2.  “Joel  is  happy  only  if  Amanda  is  happy.” 

3.  “Joel  is  happy  unless  Amanda  is  not  happy.” 


Should  a  potential  thief  necessarily  be  concerned? 

1.2.6 Equivalence

The equivalence p ⇔ q of two statements p and q, pronounced “p if, and
only if, q”, is a statement which is true if, and only if, both p and q are true,
or  both  p  and  q  are  false;  that  is,  if  p  and  q  have  the  same  truth  value.  This 
is  typically  expressed  in  English  in  one  of  the  following  ways: 

•  p  if,  and  only  if,  q; 

•  p  is  equivalent  to  q; 

•  p  is  a  necessary  and  sufficient  condition  for  q. 

The symbol for equivalence ⇔ looks like the symbol for implies ⇒ pointing
in both directions. This is very much by design since, with a bit of
thought, it is evident that p ⇔ q is true if, and only if, p ⇒ q and p ⇐ q
(that is, q ⇒ p) are both true.

Example 1.9

Let  the  variable  TrainEnter  stand  for  the  statement  “The  train  enters  the 
tunnel,”  and  let  the  variable  TunnelClear  stand  for  the  statement  “The  tunnel 
is clear.” Then TrainEnter ⇔ TunnelClear stands for the statement “The train
enters  the  tunnel  if,  and  only  if,  the  tunnel  is  clear.” 

This  statement  is  false  if  the  train  enters  the  tunnel  while  the  tunnel  is 
not  clear,  or  if  the  tunnel  is  clear  but  the  train  does  not  enter. 
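
The claim that an equivalence amounts to implication in both directions can be
checked over all four combinations of truth values. This is a minimal Python
sketch; the helper names implies and equivalent are our own.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    # p ⇒ q: p is false, or q is true
    return (not p) or q


def equivalent(p: bool, q: bool) -> bool:
    # p ⇔ q: p and q have the same truth value
    return p == q


valuations = list(product((True, False), repeat=2))

# p ⇔ q agrees with (p ⇒ q) ∧ (q ⇒ p) on every valuation.
agree = all(equivalent(p, q) == (implies(p, q) and implies(q, p))
            for p, q in valuations)
print(agree)

# Situations (TrainEnter, TunnelClear) in which the statement is false.
falsifying = [(enter, clear) for enter, clear in valuations
              if not equivalent(enter, clear)]
print(falsifying)
```

The two falsifying situations are the ones described above: the train enters
while the tunnel is not clear, or the tunnel is clear but the train does not
enter.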


1.2.7  The  Syntax  of  Propositional  Logic 

We  can  now  summarise  the  above  discussion  of  propositional  logic  in  the 
following  formal  definition.  A  statement  written  in  propositional  logic  is 
called  a  propositional  formula,  and  is  either: 

•  an  atomic  formula,  typically  represented  by  a  variable  such  as  P,  Q 
or  R;  or 

•  a  compound  formula,  in  which  case  it  is  built  up  using  the  above 
propositional  connectives  as  summarised  in  Figure  1.1. 

There  are  two  special  atomic  propositional  formulae,  true  (representing  the 
proposition  which  is  always  true)  and  false  (representing  the  proposition 
which  is  always  false). 

The above defines the formal syntax of the language of propositional
formulae. To emphasise that a propositional formula must be written
syntactically correctly according to Figure 1.1, it is also referred to as a
well-formed formula (wff).

Note  that  in  Figure  1.1  (as  well  as  throughout  this  whole  chapter)  the 
letters  p  and  q  are  not  propositional  variables,  but  rather  metavariables 
which  stand  for  arbitrary  propositions. 

If p and q are propositional formulae, then so are the following:

    true                              truth
    false                             falsity
    P                                 atomic proposition
    ¬p        not p                   negation
    p ∨ q     p or q                  disjunction
    p ∧ q     p and q                 conjunction
    p ⇒ q     if p then q             implication
    p ⇔ q     p if, and only if, q    equivalence

Figure 1.1: The formulae of propositional logic.
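
Readers with a programming background may find it helpful to see the formation
rules of Figure 1.1 as a recursive data type. The following Python sketch is
one possible encoding, not a definition taken from the text: the class names
(Const, Var, Not, Or, And, Implies, Iff) and the function show are our own.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:          # the special atomic formulae true and false
    value: bool


@dataclass(frozen=True)
class Var:            # an atomic proposition such as P, Q or Dead
    name: str


@dataclass(frozen=True)
class Not:            # negation
    operand: "Formula"


@dataclass(frozen=True)
class Or:             # disjunction
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class And:            # conjunction
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Implies:        # implication
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Iff:            # equivalence
    left: "Formula"
    right: "Formula"


Formula = Union[Const, Var, Not, Or, And, Implies, Iff]
SYMBOL = {Or: "∨", And: "∧", Implies: "⇒", Iff: "⇔"}


def show(f: Formula) -> str:
    """Write a formula out with every binary connective parenthesised."""
    if isinstance(f, Const):
        return "true" if f.value else "false"
    if isinstance(f, Var):
        return f.name
    if isinstance(f, Not):
        return "¬" + show(f.operand)
    return f"({show(f.left)} {SYMBOL[type(f)]} {show(f.right)})"


example = Implies(Or(Var("P"), Var("Q")), Not(And(Var("P"), Var("Q"))))
print(show(example))
```

Every value that can be built from these constructors corresponds to a
well-formed formula, because each constructor mirrors exactly one row of the
figure.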


1.2.8  Parentheses  and  Precedences 

It is common to use parentheses when writing mathematical expressions such
as (5 + 3) × 2, in order to disambiguate such expressions. Most mathematicians
(as well as many hand-held calculators) will calculate 5 + 3 × 2 = 11,
as it is standard to consider multiplication as binding more tightly than
addition; that is, multiplications are applied before additions whenever
possible. Multiplication is said to have a higher precedence than addition.
However, with parentheses the meaning of this expression changes dramatically:
(5 + 3) × 2 = 16. Similarly, we would use parentheses to calculate
5 − (3 − 1) = 5 − 2 = 3, as without them we would naturally apply the
subtractions left-to-right and calculate 5 − 3 − 1 = 2 − 1 = 1.

In  a  similar  vein  we  can  and  will  regularly  make  use  of  parentheses  within 
propositional  formulae  to  ensure  that  the  meaning  of  our  formulae  is  clear. 
For example, the formula P ∨ Q ⇒ R can be read either as (P ∨ Q) ⇒ R or
as P ∨ (Q ⇒ R), so we shall write the formula with parentheses in one of
the  above  ways  in  order  to  make  sure  it  is  read  as  intended.  We  shall  thus 
extend  our  definition  of  a  well-formed  formula  to  include  parentheses  which 
enclose  subformulae. 

However, to reduce the need for parentheses, we will consider ¬ as binding
more tightly than ∧, which will bind more tightly than ∨, which will
bind more tightly than ⇒, which will bind more tightly than ⇔. Apart
from this, the connectives will be applied right-to-left, so that for example
an expression of the form

p ⇒ q ∧ r ⇒ s

would be interpreted as

p ⇒ (q ∧ r) ⇒ s

due to ∧ binding more tightly than ⇒, and thus as

p ⇒ ((q ∧ r) ⇒ s)

due to the right-to-left application order of the ⇒ connectives.

Omitting  parentheses  by  adopting  the  above  precedence  and  application 
orders  on  connectives  will  often  make  formulae  easier  to  read.  However, 
parentheses  can  and  should  still  be  used  despite  these  conventions  in  cases 
when  confusions  can  easily  arise.  For  example,  we  will  typically  write 

p ⇒ ((q ∧ r) ⇒ s)

despite  the  redundancy  of  the  parentheses. 
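
The precedence and right-to-left conventions can be made concrete with a small
parser that reinserts all of the omitted parentheses. The following Python
sketch is only an illustration under our own assumptions: it uses the ASCII
stand-ins ~, &, |, => and <=> for ¬, ∧, ∨, ⇒ and ⇔, and the function names
tokenize, parse and show are hypothetical.

```python
import re

# Binding strength of the binary connectives: a larger number binds more tightly.
PRECEDENCE = {"<=>": 1, "=>": 2, "|": 3, "&": 4}
TOKEN = re.compile(r"\s*(<=>|=>|[~&|()]|[A-Za-z]\w*)")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_unary(tokens):
    if not tokens:
        raise ValueError("unexpected end of formula")
    head, rest = tokens[0], tokens[1:]
    if head == "~":                      # negation binds most tightly
        operand, rest = parse_unary(rest)
        return ("~", operand), rest
    if head == "(":
        inner, rest = parse_binary(rest, 1)
        if not rest or rest[0] != ")":
            raise ValueError("missing closing parenthesis")
        return inner, rest[1:]
    if head in PRECEDENCE or head == ")":
        raise ValueError(f"unexpected token {head!r}")
    return head, rest                    # a propositional variable


def parse_binary(tokens, min_prec):
    left, tokens = parse_unary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens[0]
        # Right-to-left application: the right operand may itself contain
        # further occurrences of the same connective.
        right, tokens = parse_binary(tokens[1:], PRECEDENCE[op])
        left = (op, left, right)
    return left, tokens


def parse(text):
    tree, rest = parse_binary(tokenize(text), 1)
    if rest:
        raise ValueError(f"unexpected token {rest[0]!r}")
    return tree


def show(tree):
    """Write the parsed formula with every binary connective parenthesised."""
    if isinstance(tree, str):
        return tree
    if tree[0] == "~":
        return "~" + show(tree[1])
    op, left, right = tree
    return f"({show(left)} {op} {show(right)})"


print(show(parse("p => q & r => s")))
```

For the expression discussed above this prints (p => ((q & r) => s)), matching
the bracketing obtained by hand.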

Example 1.10

We can express the “exclusive or” operation p ⊕ q - which says that one of
p and q is true and the other is false - as a simple equivalence, by noting
that p ⊕ q says that one of p and q is true if, and only if, the other is not
true. It can thus be defined simply by:

p ⊕ q = p ⇔ ¬q

or, equivalently, by

p ⊕ q = ¬p ⇔ q.

Both of these options abide by the hint that p ⊕ q says that one of p and q
is true if, and only if, the other is not true.

You may be tempted to define it as

p ⊕ q = (p ⇔ ¬q) ∧ (q ⇔ ¬p)

which would be correct, but this would be overkill; with a little thought you
should realise that p ⇔ ¬q is the same as q ⇔ ¬p.
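
That the three candidate definitions coincide can be confirmed by comparing
them on all four combinations of truth values. In this Python sketch the
helper names iff and xor are our own, with xor implementing the truth
conditions of ⊕ directly.

```python
from itertools import product


def iff(p: bool, q: bool) -> bool:
    # p ⇔ q: p and q have the same truth value
    return p == q


def xor(p: bool, q: bool) -> bool:
    # p ⊕ q: precisely one of p and q is true
    return p != q


all_agree = all(
    xor(p, q)
    == iff(p, not q)                              # p ⇔ ¬q
    == iff(not p, q)                              # ¬p ⇔ q
    == (iff(p, not q) and iff(q, not p))          # the “overkill” definition
    for p, q in product((True, False), repeat=2)
)
print(all_agree)
```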

Exercise 1.10 (Solution on page 407)

Express  the  following  connectives  using  the  connectives  of  propositional 
logic. 

1. The NAND connective p | q which is true if, and only if, p and q are
not both true.

2. The NOR connective p ↓ q which is true if, and only if, neither p nor q
is true.

3. The conditional connective q ◁ p ▷ r which is true if, and only if, either
p and q are both true, or ¬p and r are both true. In other words: “If
p is true then q must be true; otherwise r must be true.”

1.2.9 Syntax Trees

It can be helpful to view a well-formed propositional formula as a tree-like
diagram, called a syntax tree, in which the tree structure reflects
the way in which the formula is constructed. For example, the formula
(P ∨ Q) ⇒ ¬(P ∧ Q) corresponds to the following syntax tree.

        ⇒
      /   \
     ∨     ¬
    / \    |
   P   Q   ∧
          / \
         P   Q


To recognise the expression (P ∨ Q) ⇒ ¬(P ∧ Q) as a well-formed propositional
formula, we need only break it down into its constituent parts, and then
reconstruct it from the inside out:


•  P  and  Q,  being  propositional  variables,  are  propositional  formulae. 

• Since P and Q are propositional formulae, so too are their disjunction
P ∨ Q and conjunction P ∧ Q.

• Since P ∧ Q is a propositional formula, so too is its negation ¬(P ∧ Q).

• Since P ∨ Q and ¬(P ∧ Q) are propositional formulae, so too is their
implication (P ∨ Q) ⇒ ¬(P ∧ Q).


This  decomposition  is  directly  reflected  in  the  syntax  tree,  and  also  provides 
a  method  for  determining  whether  or  not  the  formula  is  true. 
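
The evaluation method suggested by the tree is a bottom-up recursion: the
value at each node is computed from the values of its children. The following
Python sketch assumes a representation of our own choosing, in which a tree is
a nested tuple tagged with "var", "not", "or", "and", "implies" or "iff"; the
function name evaluate is likewise hypothetical.

```python
from itertools import product


def evaluate(tree, valuation):
    """Compute the truth value of a syntax tree under a given valuation."""
    op = tree[0]
    if op == "var":
        return valuation[tree[1]]
    if op == "not":
        return not evaluate(tree[1], valuation)
    left = evaluate(tree[1], valuation)
    right = evaluate(tree[2], valuation)
    if op == "or":
        return left or right
    if op == "and":
        return left and right
    if op == "implies":
        return (not left) or right
    if op == "iff":
        return left == right
    raise ValueError(f"unknown connective {op!r}")


# The syntax tree of (P ∨ Q) ⇒ ¬(P ∧ Q)
P, Q = ("var", "P"), ("var", "Q")
tree = ("implies", ("or", P, Q), ("not", ("and", P, Q)))

# Truth value of the formula for each choice of truth values for P and Q
table = {(p, q): evaluate(tree, {"P": p, "Q": q})
         for p, q in product((True, False), repeat=2)}
for row, value in table.items():
    print(row, value)
```

The formula turns out to be false only when P and Q are both true, since then
the premise P ∨ Q holds while the conclusion ¬(P ∧ Q) fails.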

The  syntax  tree  makes  it  clear  how  the  expression  should  be  parsed, 
without  the  need  for  parentheses  or  precedence  rules  to  tell  the  reader  how 
to  interpret  the  formula.  Without  the  rules  of  precedence,  there  are  many 
different ways to read the expression P ∨ Q ⇒ ¬P ∧ Q, all of which have
completely  different  meanings  and  syntax  trees. 


Consider the expression P ⇒ ¬Q ∨ R ⇒ Q. According to the precedence
rules, it is represented by the following syntax tree:

     ⇒
    / \
   P   ⇒
      / \
     ∨   Q
    / \
   ¬   R
   |
   Q

In order to evaluate this expression - that is, to determine its truth value
- we first need to know the truth values of the propositional variables P,
Q and R. We then compute ¬Q, as ¬ binds more tightly than the other
connectives; then (¬Q) ∨ R is computed, as ∨ binds more tightly than ⇒;
then ((¬Q) ∨ R) ⇒ Q is computed, followed by P ⇒ (((¬Q) ∨ R) ⇒ Q),
since the two ⇒ connectives are applied in a right-to-left order.

Fully  bracketed,  the  formula  is  thus  interpreted  as 

P ⇒ (((¬Q) ∨ R) ⇒ Q).


Example 1.12

The string of symbols ¬(P ∧ (Q ∨ ¬)) is not a well-formed propositional
formula. This can be seen by applying the formation rules in Figure 1.1
backwards.

• ¬(P ∧ (Q ∨ ¬)) is a formula only if (P ∧ (Q ∨ ¬)) is a formula.

• (P ∧ (Q ∨ ¬)) is a formula only if P and (Q ∨ ¬) are formulae.

• P is a propositional variable and is therefore a formula.

• (Q ∨ ¬) is a formula only if Q and ¬ are formulae.

• Q is a propositional variable and is therefore a formula.

• However, ¬ is a logical connective; it is neither a propositional variable
nor a compound formula, so it is not a formula.

• Therefore, ¬(P ∧ (Q ∨ ¬)) is not a well-formed formula.


Exercise 1.12 (Solution on page 407)

Which  of  the  following  are  well-formed  formulae?  Rewrite  each  well-formed 
formula  using  a  minimal  number  of  parentheses  without  changing  its  mean¬ 
ing,  and  draw  its  syntax  tree. 

1. ((P ⇒ Q) ⇔ (Q ⇒ P)).

2. P ∨ Q(∧ P).

3. (P ∨ Q) ∧ P.

4. (P ∨ Q) ⇔ (R ¬ S)).

5. (P ∨ (Q ∧ R)) ⇔ (P ∨ (Q ∧ (P ∨ P))).

1.3 Modelling with Propositional Logic

Propositional  logic  is  very  important  for  modelling  real-life  scenarios,  in 
which  we  define  propositional  variables  to  represent  particular  properties 
which  may  be  true  or  false.  Indeed  we  have  described  many  such  examples 
already  above.  We  shall  here  consider  a  few  further  such  examples. 

Example 1.13

A  particular  computer  program  contains  the  following  lines  of  code: 


if  CabinPressure  <  MinPressure  then  PrepareForLanding; 
if  FlightHeight  <  MinHeight  then  PrepareForLanding; 


A  software  engineer  assessing  this  code  proposes  that  it  could  be  optimised 
as  follows: 


if  (CabinPressure  <  MinPressure  and  FlightHeight  <  MinHeight) 
then  PrepareForLanding; 


Is  this  correct? 

Logically,  we  can  use  the  variables  Pressure  and  Height  to  express  the 
two  conditions  that  signal  a  need  to  land;  and  the  variable  Land  to  express 
the  execution  of  PrepareForLanding.  The  program  then  gives  rise  to  the 
following  propositional  formula: 

(Pressure ⇒ Land) ∧ (Height ⇒ Land)

while the suggested optimisation corresponds to

(Pressure ∧ Height) ⇒ Land.

The  formula  corresponding  to  the  program  is  false  if,  and  only  if,  either 
Pressure  is  true  and  Land  is  false,  or  Height  is  true  and  Land  is  false;  this  is 
the  case  if,  and  only  if,  either  Pressure  or  Height  is  true  while  Land  is  false. 

The  formula  for  the  suggested  optimisation,  on  the  other  hand,  would 
only  be  false  if  both  Pressure  and  Height  are  true  while  Land  is  false;  for  exam¬ 
ple,  having  the  cabin  pressure  drop  below  its  minimum  allowed  value  would 
wrongly  not  cause  the  aeroplane  to  prepare  for  landing  if  the  aeroplane  is 
cruising  above  its  minimum  allowed  height. 

The  correct  variant  of  the  propositional  formula  -  one  which  is  equivalent 
to  the  formula  corresponding  to  the  program  -  would  be 

(Pressure ∨ Height) ⇒ Land.

That is, the optimised code should have a disjunction (or) in the
condition, not a conjunction (and). Of course this logical analysis only
confirms our intuition: the aeroplane should prepare for landing if either
condition is satisfied, not only if both of them hold.
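
The comparison between the three formulae can also be carried out mechanically
by enumerating all eight combinations of truth values for Pressure, Height and
Land. The following Python sketch is illustrative; the function names program,
proposed and corrected are our own labels for the three formulae.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def program(pressure, height, land):
    # (Pressure ⇒ Land) ∧ (Height ⇒ Land)
    return implies(pressure, land) and implies(height, land)


def proposed(pressure, height, land):
    # (Pressure ∧ Height) ⇒ Land
    return implies(pressure and height, land)


def corrected(pressure, height, land):
    # (Pressure ∨ Height) ⇒ Land
    return implies(pressure or height, land)


valuations = list(product((True, False), repeat=3))

# Valuations (Pressure, Height, Land) on which the proposed
# optimisation behaves differently from the original program.
disagreements = [v for v in valuations if program(*v) != proposed(*v)]
print(disagreements)

# The corrected formula agrees with the program everywhere.
corrected_matches = all(program(*v) == corrected(*v) for v in valuations)
print(corrected_matches)
```

The two disagreements are exactly the situations in which only one of the two
warning conditions holds and the aeroplane does not prepare for landing.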


Example 1.14

Consider  the  following  four  symbols:  a  white  circle,  a  black  circle,  a  white 
square,  and  a  black  square: 

○ ● □ ■

Let  B  represent  the  proposition  that  the  symbol  in  question  is  black,  and 
C  represent  the  proposition  that  the  symbol  in  question  is  a  circle. 

• B is true of the black circle and the black square, but false of the white
circle and the white square.

• ¬B is true of the white circle and the white square, but false of the
black circle and the black square.

• B ∨ C is true of the white circle, the black circle and the black square,
but false of the white square.

• B ∧ C is true of the black circle, but false of the white circle, the white
square and the black square.

• B ⇒ C is true of the white circle, the black circle and the white square,
but false of the black square.

• B ⇔ C is true of the black circle and the white square, but false of
the white circle and the black square.

These  facts  are  summarised  in  the  table  in  Figure  1.2.  Almost  all  of  them 
are  self-evident,  though  you  should  spend  time  considering  carefully  when 
B ⇒ C is true and when it is not true. Specifically, the only way that it
can  be  false  is  if  the  symbol  in  question  is  black  yet  is  not  a  circle. 
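
These facts can be recomputed from the truth values of B and C for each
symbol. In the following Python sketch the dictionaries symbols and formulas,
and the result true_of, are names of our own choosing; each symbol is encoded
as a pair (B, C).

```python
# Each symbol is described by the pair (B, C): is it black? is it a circle?
symbols = {
    "white circle": (False, True),
    "black circle": (True, True),
    "white square": (False, False),
    "black square": (True, False),
}

# Each formula is given as a function of the truth values of B and C.
formulas = {
    "B": lambda b, c: b,
    "¬B": lambda b, c: not b,
    "B ∨ C": lambda b, c: b or c,
    "B ∧ C": lambda b, c: b and c,
    "B ⇒ C": lambda b, c: (not b) or c,
    "B ⇔ C": lambda b, c: b == c,
}

# For every formula, list the symbols of which it is true.
true_of = {
    name: [symbol for symbol, (b, c) in symbols.items() if holds(b, c)]
    for name, holds in formulas.items()
}
for name, where in true_of.items():
    print(name, "is true of:", ", ".join(where))
```

In particular, the only symbol missing from the list for B ⇒ C is the black
square, the one symbol that is black yet is not a circle.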


                                                    ○   ●   □   ■
    B      (it’s black)                             ✗   ✓   ✗   ✓
    ¬B     (it’s not black)                         ✓   ✗   ✓   ✗
    B ∨ C  (it’s black or it’s a circle)            ✓   ✓   ✗   ✓
    B ∧ C  (it’s black and it’s a circle)           ✗   ✓   ✗   ✗
    B ⇒ C  (if it’s black then it’s a circle)       ✓   ✓   ✓   ✗
    B ⇔ C  (it’s black if and only if it’s a circle) ✗   ✓   ✓   ✗

Figure 1.2: B = “the symbol is black”, C = “the symbol is a circle”.


The Oxford mathematician Charles Lutwidge Dodgson (1832-1898), better
known as Lewis Carroll, the author of Alice in Wonderland, enjoyed
inventing puzzles which required careful logical reasoning to solve. The
following is a typical example.

Exercise 1.14 (Solution on page 408)

Lewis Carroll concludes that “Amos Judd loves cold mutton” from the
following seven assumptions:

1. All the policemen on this beat sup with our cook.

2. No man with long hair can fail to be a poet.

3.  Amos  Judd  has  never  been  in  prison. 

4.  Our  cook’s  cousins  all  love  cold  mutton. 

5.  None  but  policemen  on  this  beat  are  poets. 

6.  None  but  her  cousins  ever  sup  with  our  cook. 

7.  Men  with  short  hair  have  all  been  in  prison. 

Explain  how  Lewis  Carroll  can  draw  his  conclusion. 


Exercise 1.15 (Solution on page 410)

Translate  the  rules  for  castling  in  chess  presented  in  Example  1.1  into  propo¬ 
sitional  logic  using  the  following  propositional  variables: 

•  RightToCastleLeft  /  RightToCastleRight: 

You  have  the  right  to  castle  with  the  rook  to  the  left  /  right. 

•  MayCastleLeft  /  MayCastleRight: 

You  may  perform  a  castling  move  with  the  rook  to  the  left  /  right. 

•  KingMoved:  The  king  has  moved. 

•  LeftRookMoved  /  RightRookMoved: 

The  left  /  right  rook  has  moved. 

•  PieceBetweenLeft  /  PieceBetweenRight: 

There  is  a  piece  between  the  king  and  the  rook  to  the  left  /  right. 

•  KingAttack:  The  king  is  under  attack. 

• LeftSquareAttack / RightSquareAttack:

The  square  to  the  left  /  right  of  the  king  is  under  attack. 

•  KingMoveLeftAttack  /  KingMoveRightAttack: 

The  square  two  to  the  left  /  right  of  the  king  is  under  attack. 


The  following  puzzle  may  appear  hard  at  first  sight,  but  it  becomes 
surprisingly  simple  when  approached  logically. 


Exercise 1.16 (Solution on page 410)

Joel,  Felix  and  Oskar  give  Amanda  the  following  puzzle.  The  three  of  them 
each  write  their  name  on  a  piece  of  paper,  and  then  exchange  the  pieces  of 
paper  so  that  no  one  has  the  piece  with  their  own  name  on  it.  They  then 
hold  these  pieces  of  paper  so  that  Amanda  can’t  see  what’s  on  them,  but 
tell  her  that  each  has  the  name  of  one  of  the  others,  and  they  challenge  her 
to  figure  out  who  is  holding  each  name.  She  is  allowed  to  look  at  the  name 
written  on  any  one  piece  of  paper. 


1.  Give  a  propositional  formula  which  expresses  the  fact  that  each  boy 
holds  one  of  the  pieces  of  paper  but  no  one  holds  the  piece  of  paper 
with  their  own  name  on  it.  Use  the  following  propositional  variables 
to  do  this. 


    JonF    “Joel” is on Felix’s paper.
    JonO    “Joel” is on Oskar’s paper.
    FonJ    “Felix” is on Joel’s paper.
    FonO    “Felix” is on Oskar’s paper.
    OonJ    “Oskar” is on Joel’s paper.
    OonF    “Oskar” is on Felix’s paper.


2.  Suppose  Amanda  looks  at  Joel’s  paper  and  sees  “Oskar”  written  on  it. 
Use  the  formula  above  to  deduce  what  name  is  written  on  the  other 
two  pieces  of  paper. 

Despite their intentionally obfuscated form, the statements in the Amos
Judd  puzzle  in  Exercise  1.14  are  precise  and  unambiguous.  There  are,  how¬ 
ever,  many  common  abuses  of  logical  arguments  arising  from  the  ambiguities 
of  a  natural  language  such  as  English.  In  the  following  examples  we  consider 
particular  difficulties  which  beginning  logicians  often  find  problematic. 

Children can get very unruly in the back seat of the family car during long
drives.  In  such  instances,  an  increasingly  exasperated  father  in  the  driving 
seat  might  find  himself  making  promises  such  as  the  following: 


“Everyone  who  sits  quietly  for  the  next  hour 
will  get  an  ice  cream  when  we  stop  for  petrol.” 

What exactly does this statement say? And more importantly, does it
express what the father means to say? You might well imagine that he wants
to suggest that:

“Anyone  who  misbehaves  will  not  get  ice  cream.” 

However,  this  does  not  follow  from  his  statement:  the  children  who  get  ice 
cream  will  include  those  who  sit  quietly,  but  may  well  include  the  noisy  ones 
as  well.  In  fact,  he  knows  that  even  greater  problems  of  retribution  will  arise 
later  on  during  the  drive  if  only  some  of  the  children  get  the  promised  ice 
cream,  so  it  is  always  his  unspoken  intention  that  all  of  the  children  will  get 
ice  cream,  regardless  of  their  behaviour  (within  reason). 

His  aim  in  making  the  statement  was  to  manipulate  language  to  his 
benefit,  as  well  as  to  provide  a  lesson  for  his  children  in  its  logical  use.  He 
was  being  intentionally  vague,  relying  on  his  children  to  misinterpret  his 
statement  as  saying  something  more  than  it  actually  does,  namely  that  any 
misbehaving  children  will  not  get  ice  cream.  When  in  the  end  even  the 
misbehaving  children  get  ice  cream,  those  that  sat  quietly  in  anticipation  of 
their  reward  would  be  mildly  upset  at  the  unfairness  of  it  all,  but  they  could 
not  argue  with  their  father’s  explanation  that  he  did  not  actually  say  that 
the  unruly  children  would  lose  out.  Without  a  doubt  he  spoke  the  truth. 

Needless  to  say,  this  strategy  would  not  work  for  very  long,  as  the  children 
will  quickly  become  keen  interpreters  of  any  statements  that  their  father 
makes. 


Example 1.17

Suppose a menu at a restaurant states the following:


“You  may  have  coffee  or  tea  with  your  meal.” 

This  clearly  expresses  a  disjunction  of  two  atomic  propositions: 

“You  may  have  coffee  with  your  meal 
or  you  may  have  tea  with  your  meal.” 

However,  does  it  really  do  this?  Clearly  the  intention  is  that  if  you  ask  for 
coffee,  then  you  will  be  served  coffee.  But  consider  the  following  scenarios. 

1. Suppose the coffee maker is broken on the day you visit, and only tea
is available that day; is the menu wrong in this case? Certainly not
logically, assuming that you may still have tea.

2. Suppose the restaurant doesn’t have a coffee maker, and never actually
serves coffee at all; is the menu wrong in this case? Again, certainly
not logically, assuming that it serves tea.

The real intention of the proposition on the menu is something more akin
to a conjunction than a disjunction, as follows.

“You  may  have  coffee  with  your  meal 
and  you  may  have  tea  with  your  meal.” 

However,  this  is  still  not  true  either,  as  it  is  unlikely  that  the  restaurant 
intends  to  allow  you  to  order  both  beverages  with  your  meal.  The  following 
proposition  might  be  a  more  accurate  interpretation  of  the  intended  option 
on  the  menu. 

“You  may  have  coffee  with  your  meal 
and  you  may  have  tea  with  your  meal, 
but  not  both.” 

Are  you  satisfied  with  this?  There  is  in  fact  still  something  seriously  wrong 
with  this  proposition.  To  see  this  clearly,  let  us  introduce  the  following  two 
atomic  propositions. 

A  =  You  may  have  coffee  with  your  meal. 

B  =  You  may  have  tea  with  your  meal. 

Then  the  above  proposition  is 

(A ∧ B) ∧ ¬(A ∧ B).

However, this proposition is of the form p ∧ ¬p; and recalling the fact noted
after Example 1.6 that no proposition p (such as A ∧ B) can be true at the
same time as its negation ¬p, this means that the menu is giving no option
whatsoever!

The  problem  here  is  one  of  modality.  That  we  may  have  a  coffee,  and 
that  we  indeed  do  have  a  coffee,  are  different  propositions,  and  we  need  to 
be  careful  how  we  treat  such  modalities. 

To  correctly  formulate  the  option,  we  might  introduce  the  following  two 
atomic  propositions. 

C  =  You  have  coffee  with  your  meal. 

T  =  You  have  tea  with  your  meal. 

Then the option stated on the menu would stipulate that one, and only
one, of these atomic propositions is true. This can be rendered in many
(equivalent) ways, such as

(C ∨ T) ∧ ¬(C ∧ T)

“You have coffee with your meal
or you have tea with your meal,
but not both.”

or

(C ∧ ¬T) ∨ (¬C ∧ T)

“You  have  coffee  but  not  tea  with  your  meal 
or  you  have  tea  but  not  coffee  with  your  meal.” 

But  this  is  still  not  the  end  of  the  story.  Perhaps  a  particular  diner  drinks 
neither  coffee  nor  tea.  The  menu  surely  doesn’t  force  the  diner  to  accept 
one  of  these  beverages;  the  diner  surely  has  the  option  of  having  neither. 
The option on the menu thus is merely stipulating the following:

¬(C ∧ T)

“You do not have both coffee and tea with your meal.”

or equivalently

¬C ∨ ¬T

“You do not have coffee with your meal
or you do not have tea with your meal.”

From this simple English proposition has sprouted a plethora of
complications. This is the greatest problem in formulating the design of
systems, and hence of getting such designs correct.
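
The successive readings of the menu can be compared mechanically by listing, for each candidate formula, the combinations of truth values that make it true. The following Python sketch is only an illustration of that idea; the names `satisfying`, `no_option`, `exactly_one` and `at_most_one` are our own and do not come from the text.

```python
from itertools import product

def satisfying(formula):
    """Return every pair of truth values (x, y) for which formula(x, y) is true."""
    return [(x, y) for x, y in product([False, True], repeat=2) if formula(x, y)]

# (A and B) and not (A and B): the "may have" reading, which is a contradiction.
no_option = lambda a, b: (a and b) and not (a and b)

# (C or T) and not (C and T): exactly one of coffee and tea.
exactly_one = lambda c, t: (c or t) and not (c and t)

# not (C and T): at most one of coffee and tea, possibly neither.
at_most_one = lambda c, t: not (c and t)

print(satisfying(no_option))    # []
print(satisfying(exactly_one))  # [(False, True), (True, False)]
print(satisfying(at_most_one))  # [(False, False), (False, True), (True, False)]
```

Only the last reading leaves room for the diner who wants neither beverage, which is the point made at the end of the example.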


Example 1.18

If  p  is  false  then  by  definition  p  =>  q  is  true  regardless  of  the  truth  of  q. 
This  observation  gives  rise  not  so  much  to  a  problem  of  ambiguity,  but  to 
one  of  misunderstanding  and  confusion.  For  example,  assuming  that  Carlos 
is  an  ordinary  man  who  is  not  the  King  of  Spain,  the  following  proposition 
is  false: 

“If  Carlos  is  a  man,  then  Carlos  is  the  King  of  Spain.” 

However,  the  following  statement  is  true: 

“If  Carlos  is  a  woman,  then  Carlos  is  the  King  of  Spain.” 

Do not be distracted by the falsity of the conclusion; the only way that
the above statement can be false is if the premise is true whilst the
conclusion is false. It is unfortunately a common misconception that the
above implication is false, and that the implication should instead read
as follows:

“If Carlos is a woman, then Carlos is the Queen of Spain.”

This  statement  is  true  as  well,  for  precisely  the  same  reason  that  the  previous 
one  is  true:  the  premise  of  the  implication  is  false. 

Though  this  is  a  common  confusion,  it  is  well  understood  and  properly 
applied  in  several  instances  of  natural  language.  For  example,  the  statement 

“If  I  told  you  once,  I  told  you  a  hundred  times!” 

is  meant  to  convey  that  you  have  been  told  something  a  hundred  times 
(assuming  that  you’ve  been  told  once).  This  statement,  of  course,  is  typically 
false  due  to  an  intended  use  of  hyperbole  -  it  is  highly  unlikely  that  you 
have  been  told  something  so  many  times. 

As  another  example,  the  statement 

“If  he  ever  pays  me  back,  then  I’ll  be  a  monkey’s  uncle!” 

expresses  the  doubt  (i.e.,  falsity)  that  money  lent  will  ever  be  returned, 
by  concluding  an  obviously-false  conclusion  from  the  premise  which  is  being 
denied.  As  I  can  never  be  a  monkey’s  uncle,  the  only  way  that  this  statement 
can  be  true  is  if  he  never  pays  me  back. 


Example 1.19

Suppose  your  teacher  says  the  following  to  you: 

“If  you  understand  implication,  then  you  will  pass  the  exam.” 

There  are  four  scenarios  to  consider: 

1.  Suppose  you  understand  implication,  and  you  pass  the  exam.  Clearly 
you  would  consider  the  above  statement  to  be  true. 

2.  Suppose  you  don’t  understand  implication,  and  you  fail  the  exam. 
Again  you  would  consider  the  above  statement  to  be  true,  and  you 
might  even  think  your  teacher  to  be  a  wise  sage.  However,  this  thought 
would  just  go  to  show  that  you  indeed  don’t  understand  implication. 
The  reason  you  failed  the  exam  is  not  (necessarily)  because  you  don’t 
understand  implication.  To  understand  this  point,  consider  the  next 
scenario. 

3.  Suppose  you  don’t  understand  implication,  but  nonetheless  you  pass 
the  exam,  because  you  understand  enough  of  the  rest  of  the  material. 
This  does  not  contradict  your  teacher’s  claim;  it  is  still  true. 

4.  Suppose,  finally,  that  you  understand  implication,  but  you  fail  the 
exam  nonetheless.  In  this  case  you  may  feel  angry  towards  your 
teacher,  since  he  was  obviously  lying  to  you.  (Of  course,  your  teacher 
would  maintain  that  it  is  you  who  are  lying,  in  claiming  that  you 
understand  implication.) 

In summary, the only way for the teacher’s statement to be false is if the
premise  is  true  (i.e.,  you  understand  implication)  while  the  conclusion  is 
false  (you  fail  the  exam). 
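
The four scenarios can be tabulated with a few lines of Python. This is a minimal sketch: the helper `implies` and the variable names `understand` and `passed` are our own labels for the premise and the conclusion of the teacher's claim.

```python
def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# One line per scenario: does the teacher's claim hold in it?
for understand in (True, False):
    for passed in (True, False):
        verdict = "true" if implies(understand, passed) else "false"
        print(f"understand={understand!s:5}  pass={passed!s:5}  claim is {verdict}")
```

Exactly one of the four printed lines reports the claim as false: the one in which you understand implication and still fail.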


Exercise 1.19 (Solution on page 411)

Consider  the  following  four  symbols:  a  white  circle,  a  black  circle,  a  white 
square,  and  a  black  square: 

○  ●  □  ■

I  have  in  mind  one  of  these  four  symbols.  I  will  accept  any  symbol  which 
either  has  the  same  colour  or  the  same  shape  (or  both)  as  the  one  I  have 
in  mind,  and  otherwise  I  will  reject  it.  If  I  accept  the  black  square,  what 
does  this  suggest  to  you  about  whether  I  accept  or  reject  the  other  three 
symbols? 


Exercise 1.20 (Solution on page 411)

If  two’s  a  company  and  three’s  a  crowd,  what’s  four  and  five? 


1.5 Truth Tables

By thinking carefully about the logical connectives, we can informally
understand their intended meanings. However, we still need to express these
meanings precisely; that is, we need to define the meaning of the
connectives. In doing this, the semantics of propositional logic is formally,
rigorously and unambiguously defined.

One  way  in  which  we  can  do  this  concisely  is  by  explicitly  listing  out 
the  truth  values  which  a  compound  formula  takes  for  each  of  the  possible 
combinations  of  truth  values  of  its  constituent  propositions.  A  table  which 
contains  this  listing  is  called  a  truth  table. 

For example, negation ¬p can be defined by specifying its truth value
for each of the two possible truth values of p: if the truth value of p is true,
then the truth value of ¬p will be false; and if the truth value of p is false,
then the truth value of ¬p will be true. For ease of presentation, we shall
reserve the symbols T for true and F for false. The truth table for negation
is thus as follows.

 p    ¬p
 F    T
 T    F

The remaining four connectives are similarly defined by the following truth
tables, which all have four rows corresponding to the four distinct
combinations of truth values for the two propositions p and q being combined
using the connectives.

 p   q   p ∧ q   p ∨ q   p => q   p <=> q
 F   F     F       F       T         T
 F   T     F       T       T         F
 T   F     F       T       F         F
 T   T     T       T       T         T
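
The same definitions can be written down as small functions and tabulated by machine. The sketch below is one possible Python rendering; the dictionary `CONNECTIVES` and the helpers `tf` and `truth_table` are names chosen here for illustration only.

```python
from itertools import product

# The four binary connectives as Python functions on booleans.
CONNECTIVES = {
    "and": lambda p, q: p and q,
    "or": lambda p, q: p or q,
    "=>": lambda p, q: (not p) or q,
    "<=>": lambda p, q: p == q,
}

def tf(value):
    """Render a boolean as T or F, as in the text."""
    return "T" if value else "F"

def truth_table(op):
    """List the rows (p, q, p op q) in the order FF, FT, TF, TT."""
    return [(p, q, op(p, q)) for p, q in product([False, True], repeat=2)]

for name, op in CONNECTIVES.items():
    print(f"p q | p {name} q")
    for p, q, result in truth_table(op):
        print(f"{tf(p)} {tf(q)} | {tf(result)}")
    print()
```

Each connective is fully determined by its four rows, which is exactly what a truth table records.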


Truth tables can also be used to understand far more complicated
formulae, such as in the following example.


Consider  the  statement  from  Example  1.16  made  by  a  certain  father: 


“Everyone  who  sits  quietly  for  the  next  hour 
will  get  an  ice  cream  when  we  stop  for  petrol.” 

Let  us  define  the  following  atomic  propositions. 

Quiet  =  You  sit  quietly. 

Ice  =  You  get  an  ice  cream. 

For you, as a perfectly logical child, the above statement translates to
Quiet => Ice - if you remain quiet then you will get an ice cream - which has
the following truth table:

 Quiet   Ice   Quiet => Ice
   F      F         T
   F      T         T
   T      F         F
   T      T         T


The  only  scenario  in  which  the  above  statement  can  be  considered  false  is 
if  Quiet  is  true  and  Ice  is  false  -  that  is,  if  you  do  not  get  an  ice  cream 
despite being quiet; in this instance you would be justified in being angry
with  your  father  for  lying  to  you.  However,  your  father,  being  trustworthy, 
would  never  allow  this  scenario. 

It is tempting to be angry that your noisy siblings also get ice cream.
However, there is no justification for this based on the above statement. As
is clear from the second row of the truth table, the statement is true even in
the instance that a noisy child gets an ice cream. It is a common pitfall to
interpret p => q as p <=> q (that is, to understand from the above statement
that you will get an ice cream if, and only if, you are quiet), and to believe
that p => q implies that q => p (that is, to understand from the above
statement that you will not get an ice cream if you are not quiet).

The above statement is giving you a guarantee that you will get an ice
cream if you are quiet - and therefore you had best be quiet. If you are not
quiet, then there is no guarantee that you will get an ice cream, but there
is no guarantee that you won’t!
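
To see the pitfall concretely, we can put the father's promise next to its two common misreadings. The following Python sketch is illustrative only; the names `promise`, `converse`, `both` and `differs` are ours.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

print("Quiet Ice   | Quiet=>Ice  Ice=>Quiet  Quiet<=>Ice")
for quiet, ice in product([False, True], repeat=2):
    promise = implies(quiet, ice)    # what the father actually said
    converse = implies(ice, quiet)   # "no ice cream unless you are quiet"
    both = quiet == ice              # the "if, and only if" misreading
    print(f"{quiet!s:5} {ice!s:5} | {promise!s:10}  {converse!s:10}  {both}")

# Rows on which the promise and the "if, and only if" reading disagree.
differs = [(q, i) for q, i in product([False, True], repeat=2)
           if implies(q, i) != (q == i)]
print(differs)  # [(False, True)]
```

The single disagreeing row is the noisy child who gets an ice cream: the promise is kept there, although both misreadings would be violated.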


Exercise 1.21 (Solution on page 411)

Recall  the  statement  from  Example  1.19  made  by  a  certain  teacher: 


“If  you  understand  implication,  then  you  will  pass  the  exam.” 


Translate  this  statement  into  a  propositional  formula,  and  use  its  truth  table 
to  justify  when  it  is  true  or  false. 


Example 1.21

Catherine wishes to go to a party tonight, and would be happy to go with
either  Jim  or  Jules.  However,  as  she  is  currently  dating  both  Jim  and  Jules, 
she  doesn’t  want  to  go  to  the  party  if  they  will  both  be  there. 

Let  us  define  the  following  atomic  propositions. 

Cat  =  Catherine  goes  to  the  party. 

Jim  =  Jim  goes  to  the  party. 

Jules  =  Jules  goes  to  the  party. 

Catherine’s  predicament  then  can  be  formalised  as  follows. 

Cat => ¬(Jim ∧ Jules).

This  proposition  states  that  Catherine  goes  to  the  party  only  if  Jim  and 
Jules  don’t  both  go  to  the  party.  We  can  determine  when  this  proposition 
is  true  or  false  by  building  up  a  truth  table  based  on  all  possible  values  of 
the  atomic  propositions  Cat,  Jim  and  Jules,  and  the  values  of  the  constituent 
propositions  which  make  up  the  complete  proposition.  The  resulting  truth 
table  is  as  follows. 

Cat   Jim   Jules   Jim ∧ Jules   ¬(Jim ∧ Jules)   Cat => ¬(Jim ∧ Jules)
 F     F      F          F              T                    T
 F     F      T          F              T                    T
 F     T      F          F              T                    T
 F     T      T          T              F                    T
 T     F      F          F              T                    T
 T     F      T          F              T                    T
 T     T      F          F              T                    T
 T     T      T          T              F                    F


The first three columns systematically list out the eight distinct
combinations of truth values for the three propositions Cat, Jim and Jules;
the next column applies the rules from the truth table for ∧ to the columns
for Jim and Jules; the next column applies the rules for ¬ to the column just
constructed; and the final column applies the rules for => to the columns for
Cat and ¬(Jim ∧ Jules). From this we can discover that the proposition is
true in all cases except when all three atomic propositions are true; that is,
it is false if, and only if, all three participants in this love triangle go to the
party.
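
The column-by-column construction translates directly into a short program. This Python sketch assumes nothing beyond the formula itself; the variable names (`both`, `not_both`, `formula`, `falsifying`) are ours.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

def tf(value):
    """Render a boolean as T or F, as in the text."""
    return "T" if value else "F"

rows = []
for cat, jim, jules in product([False, True], repeat=3):
    both = jim and jules              # column for Jim and Jules
    not_both = not both               # column for not (Jim and Jules)
    formula = implies(cat, not_both)  # column for Cat => not (Jim and Jules)
    rows.append((cat, jim, jules, both, not_both, formula))

print("Cat Jim Jules | Jim&Jules  not(Jim&Jules)  formula")
for row in rows:
    print("   ".join(tf(value) for value in row))

# The only falsifying assignment is the one in which all three go to the party.
falsifying = [row[:3] for row in rows if not row[5]]
print(falsifying)  # [(True, True, True)]
```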

As  a  point  of  interest,  we  can  build  truth  tables  in  a  more  concise  way 
which  entails  writing  the  proposition  of  interest  along  the  top  row  of  the 
truth  table,  and  filling  in  columns  defined  by  the  propositional  variables 
and  connectives,  working  from  the  “inside  out.”  The  truth  table  for  the 
above  example  would  then  be  rendered  as  follows: 


Cat   Jim   Jules  |  Cat   =>    ¬   (Jim    ∧   Jules)
 F     F      F    |   F     T    T     F     F      F
 F     F      T    |   F     T    T     F     F      T
 F     T      F    |   F     T    T     T     F      F
 F     T      T    |   F     T    F     T     T      T
 T     F      F    |   T     T    T     F     F      F
 T     F      T    |   T     T    T     F     F      T
 T     T      F    |   T     T    T     T     F      F
 T     T      T    |   T     F    F     T     T      T
(0)   (0)    (0)   |  (1)   (4)  (3)   (1)   (2)    (1)

The  bottom  row  of  numbers  is  included  in  this  example  to  indicate  at  what 
stage  each  column  was  filled  in: 

(0) The three initial columns are filled in, representing all 8 possible
combinations of truth values for the atomic propositions Cat, Jim and Jules.

(1) The columns for the propositional variables are then filled in during
the first stage.

(2) After this the column for Jim ∧ Jules is filled in (under the ∧ symbol)
during the second stage.

(3) Then the column for ¬(Jim ∧ Jules) is filled in (under the ¬ symbol)
during the third stage.

(4) Finally the column for Cat => ¬(Jim ∧ Jules) is filled in (under the =>
symbol) during the fourth stage.

Each  column  is  computed  by  referring  to  columns  which  have  been  computed 
in  earlier  stages. 
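
The "inside out" order can be mimicked by a recursive evaluator that reports, alongside each truth value, the stage at which it becomes available. The representation below (nested tuples, the names `FORMULA` and `evaluate`) is an assumption made for this sketch, not notation from the text.

```python
from itertools import product

# A formula is a variable name (str) or a tuple:
# ("not", f), ("and", f, g) or ("=>", f, g).
FORMULA = ("=>", "Cat", ("not", ("and", "Jim", "Jules")))

def evaluate(formula, env):
    """Return (value, stage): the truth value and the stage at which it is known."""
    if isinstance(formula, str):
        return env[formula], 1              # variables are copied in at stage 1
    op, *args = formula
    results = [evaluate(arg, env) for arg in args]
    values = [value for value, _ in results]
    stage = 1 + max(s for _, s in results)  # one stage after the latest operand
    if op == "not":
        return (not values[0]), stage
    if op == "and":
        return (values[0] and values[1]), stage
    if op == "=>":
        return ((not values[0]) or values[1]), stage
    raise ValueError(f"unknown connective: {op}")

for cat, jim, jules in product([False, True], repeat=3):
    env = {"Cat": cat, "Jim": jim, "Jules": jules}
    value, stage = evaluate(FORMULA, env)
    print(env, value, stage)
```

For this formula the conjunction is known at stage 2, the negation at stage 3 and the implication at stage 4, matching the numbering in the bottom row of the table.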


Exercise 1.22 (Solution on page 412)

How many rows will there be in a truth table involving four propositional
variables P, Q, R and S? What if there are five propositional variables?
What if there are n propositional variables?


Exercise 1.23 (Solution on page 412)

Construct truth tables for the following propositions.

1. ¬(P <=> ¬Q).

2. (P ∧ Q) ∨ (¬P ∧ ¬Q).

3. (P ∧ Q) => (¬R ∨ S).

Exercise 1.24 (Solution on page 413)

The “exclusive or” operation p ⊕ q has the following truth table:

 p   q   p ⊕ q
 F   F     F
 F   T     T
 T   F     T
 T   T     F

That is, p ⊕ q is true if, and only if, one of p and q is true and the other is
false.

Confirm that the formula you gave in Example 1.10 (page 29) for
expressing p ⊕ q in propositional logic gives the same truth table.

1.6 Equivalences and Valid Arguments

We have seen that a given proposition can be expressed as a formula in
propositional logic in different yet equivalent ways. As a further example,
the formula

Cat => ¬(Jim ∧ Jules)      “If Cat then not both of Jim and Jules.”

from Example 1.21 is equivalent to the formula

¬(Cat ∧ Jim ∧ Jules)       “Cat, Jim and Jules cannot all be true.”

as well as

¬Cat ∨ ¬Jim ∨ ¬Jules.      “At least one of Cat, Jim or Jules is false.”

To verify that two compound formulae p and q are equivalent, we could
construct truth tables for p and q and observe that they have the same
truth values under all interpretations of their respective atomic propositions.
Alternatively we could build the truth table for the formula p <=> q and
observe that it is true under all interpretations. If so, the two propositions
p and q are said to be logically equivalent.
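
Checking equivalence by comparing truth tables is easy to automate. In the Python sketch below, `equivalent` is a hypothetical helper of our own, and `f1`, `f2` and `f3` encode the three formulae about Cat, Jim and Jules given above.

```python
from itertools import product

def equivalent(f, g, arity):
    """True if f and g agree under every assignment of truth values."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=arity))

# Cat => not (Jim and Jules)
f1 = lambda cat, jim, jules: (not cat) or not (jim and jules)
# not (Cat and Jim and Jules)
f2 = lambda cat, jim, jules: not (cat and jim and jules)
# not Cat or not Jim or not Jules
f3 = lambda cat, jim, jules: (not cat) or (not jim) or (not jules)

print(equivalent(f1, f2, 3))  # True
print(equivalent(f2, f3, 3))  # True

# By contrast, p => q and p <=> q are not equivalent.
print(equivalent(lambda p, q: (not p) or q, lambda p, q: p == q, 2))  # False
```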

A proposition which is true regardless of the truth values of its atomic
propositions is called a tautology, and the proposition is said to be valid. A
contradiction on the other hand is a proposition which is false regardless of
the truth values of its atomic propositions, and is said to be unsatisfiable.
A proposition which is true under some interpretation of its atomic
propositions - that is, one that is not a contradiction - is said to be satisfiable.
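
These three notions can be told apart by inspecting the final column of a truth table. The function `classify` below is a small Python sketch of our own; note that every tautology is also satisfiable, so the third label is reserved for formulae that are true on some rows and false on others.

```python
from itertools import product

def classify(formula, arity):
    """Classify a formula by the truth values it takes over all assignments."""
    values = [bool(formula(*vals))
              for vals in product([False, True], repeat=arity)]
    if all(values):
        return "tautology"
    if not any(values):
        return "contradiction"
    return "satisfiable but not valid"

print(classify(lambda p: p or not p, 1))      # tautology
print(classify(lambda p: p and not p, 1))     # contradiction
print(classify(lambda p, q: (not p) or q, 2)) # satisfiable but not valid
```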


Example 1.24

Any formula of the form p ∨ ¬p is a tautology, while any of the form p ∧ ¬p
is a contradiction. These facts were noted already in Section 1.2, and can
be verified formally by constructing the truth tables for these formulae.


 p   ¬p   p ∨ ¬p   p ∧ ¬p
 F    T      T        F
 T    F      T        F

Each entry in the column for p ∨ ¬p is true, confirming that p ∨ ¬p is a
tautology, while each entry in the column for p ∧ ¬p is false, confirming that
p ∧ ¬p is a contradiction.

Note that if we take p = A ∧ B, then the contradiction

p ∧ ¬p = (A ∧ B) ∧ ¬(A ∧ B)

is precisely the formula which appeared in Example 1.17 (page 37).


Exercise 1.25 (Solution on page 413)


Construct  truth  tables  for  each  of  the  following  formulae  to  determine  which 
are  tautologies  and  which  are  contradictions. 


1. p ∨ (¬p ∧ q).

2. (p ∧ q) ∧ ¬(p ∨ q).

3. (p => ¬p) <=> ¬p.

4.  (p  g)  p. 

•r>-  V  > 


Tautologies are important in ascertaining the validity of arguments.
Consider, for example, our first argument from Section 1.1 (page 18):

1.  Either  this  man  is  dead  or  my  watch  has  stopped. 

2.  My  watch  is  still  ticking. 

Therefore 

3.  This  man  is  dead. 

This argument is valid if the conjunction of the two premises implies the
conclusion, that is, if the following implication is valid:

(Dead ∨ Watch) ∧ ¬Watch => Dead

Again, this means that the proposition is a tautology, that it is true
regardless of the truth values of its atomic propositions. We can easily
confirm this by constructing a truth table for this proposition:


Dead   Watch  |  (Dead   ∨   Watch)   ∧    ¬   Watch   =>   Dead
 F       F    |    F     F     F      F    T     F      T     F
 F       T    |    F     T     T      F    F     T      T     F
 T       F    |    T     T     F      T    T     F      T     T
 T       T    |    T     T     T      F    F     T      T     T

In  contrast,  consider  the  argument  suggested  by  Exercise  1.9  (page  26): 

1.  If  my  dog  barks,  then  my  dog  doesn’t  bite. 

2.  My  dog  doesn’t  bark. 

Therefore 

3.  My  dog  bites. 

Its formalisation yields the following truth table:

Barks   Bites  |  (Barks   =>   ¬   Bites)   ∧    ¬   Barks   =>   Bites
  F       F    |     F      T   T     F      T    T     F      F     F
  F       T    |     F      T   F     T      T    T     F      T     T
  T       F    |     T      T   T     F      F    F     T      T     F
  T       T    |     T      F   F     T      F    F     T      T     T


The first row of this truth table shows that the proposition - and hence the
argument it represents - is not valid. It presents a scenario in which the
proposition is false: a dog that neither barks nor bites satisfies both
premises, but not the conclusion. Such a dog provides a counterexample
to the validity of the argument.
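
Searching for counterexamples is a mechanical task: look for an assignment that makes every premise true and the conclusion false. The Python sketch below uses a helper `counterexamples` of our own devising and applies it to the two arguments just discussed.

```python
from itertools import product

def counterexamples(premises, conclusion, arity):
    """Assignments making every premise true and the conclusion false."""
    return [vals for vals in product([False, True], repeat=arity)
            if all(p(*vals) for p in premises) and not conclusion(*vals)]

# Argument 1: Dead or Watch; not Watch; therefore Dead.
watch_arg = counterexamples(
    [lambda dead, watch: dead or watch, lambda dead, watch: not watch],
    lambda dead, watch: dead,
    2,
)

# Argument 2: Barks => not Bites; not Barks; therefore Bites.
dog_arg = counterexamples(
    [lambda barks, bites: (not barks) or (not bites),
     lambda barks, bites: not barks],
    lambda barks, bites: bites,
    2,
)

print(watch_arg)  # []
print(dog_arg)    # [(False, False)]
```

An argument is valid precisely when this search comes back empty; the single counterexample found for the second argument is the dog that neither barks nor bites.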


Exercise 1.26 (Solution on page 414)

In Example 1.13 we represented a piece of computer program in
propositional logic as:

p = (Pressure => Land) ∧ (Height => Land).

We also considered two optimisations of this program represented as

q = Pressure ∧ Height => Land;
r = Pressure ∨ Height => Land.

Of course, an optimisation is only correct if the representation of the
optimised program code is equivalent to the original one. Explain which of the
two optimisations is correct and which is not.


1.7 Algebraic Laws for Logical Equivalences

Using truth tables to prove properties about propositions, specifically that
two propositions are equivalent, can quickly become tedious. However, we
can avoid relying on truth tables by reasoning equationally much as we
would do in algebra and arithmetic.

For example, we might conclude that 3 × (4 + 5) = 27 in the following
way:

3 × (4 + 5) = (3 × 4) + (3 × 5)
            = 12 + 15 = 27.

In the first line of this calculation we used the algebraic law that says that
multiplication distributes over addition: a(b + c) = ab + ac; and in the second
line we used the principle that we can replace equals by equals: if a = b and
c = d then a + c = b + d.

A similar kind of reasoning is possible with propositional logic, with
equivalence <=> playing the role of equality =. Once we have determined that
two propositions p and q are equivalent, that p <=> q, we can then replace
one with the other. First, though, we need to know what equivalences we
can use as our “algebraic laws”. A large number of these are given as follows.


Commutativity Laws      p ∨ q <=> q ∨ p                      p ∧ q <=> q ∧ p

Associativity Laws      p ∨ (q ∨ r) <=> (p ∨ q) ∨ r          p ∧ (q ∧ r) <=> (p ∧ q) ∧ r

Idempotence Laws        p ∨ p <=> p                          p ∧ p <=> p

Distributivity Laws     p ∨ (q ∧ r) <=> (p ∨ q) ∧ (p ∨ r)    p ∧ (q ∨ r) <=> (p ∧ q) ∨ (p ∧ r)

De Morgan’s Laws        ¬(p ∨ q) <=> ¬p ∧ ¬q                 ¬(p ∧ q) <=> ¬p ∨ ¬q

Double Negation Law     ¬¬p <=> p

Tautology Laws          p ∨ true <=> true                    p ∧ true <=> p

Contradiction Laws      p ∨ false <=> p                      p ∧ false <=> false

Excluded Middle Laws    p ∨ ¬p <=> true                      p ∧ ¬p <=> false

Absorption Laws         p ∨ (p ∧ q) <=> p                    p ∧ (p ∨ q) <=> p

Implication Law         (p => q) <=> ¬p ∨ q

Contrapositive Law      (p => q) <=> (¬q => ¬p)

Equivalence Law         (p <=> q) <=> (p => q) ∧ (q => p)

You  can  (and  should)  show  that  all  of  the  above  laws  are  valid  tautologies 
by  constructing  appropriate  truth  tables.  However,  some  laws  can  be  shown 
to  be  valid  by  using  laws  that  have  already  been  previously  confirmed.  For 
example,  we  can  verify  the  validity  of  the  Contrapositive  Law  as  follows: 


      p => q
<=>   ¬p ∨ q         (Implication)
<=>   q ∨ ¬p         (Commutativity)
<=>   ¬¬q ∨ ¬p       (Double Negation)
<=>   ¬q => ¬p       (Implication)


Of  course,  this  derivation  relies  on  the  Implication,  Commutativity  and 
Double  Negation  Laws  being  verified  first. 
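
Checking a law by truth table amounts to comparing both sides on every row. The Python sketch below does this for a handful of the laws; the helper `is_law`, the dictionary `LAWS` and the selection of laws in it are our own choices for illustration.

```python
from itertools import product

def implies(p, q):
    """Defined from the truth table: false only when p is true and q is false."""
    return not (p and not q)

def is_law(lhs, rhs, arity):
    """True if lhs and rhs agree on every row of their truth table."""
    rows = product([False, True], repeat=arity)
    return all(bool(lhs(*row)) == bool(rhs(*row)) for row in rows)

LAWS = {
    "De Morgan (or)": (lambda p, q: not (p or q),
                       lambda p, q: (not p) and (not q), 2),
    "De Morgan (and)": (lambda p, q: not (p and q),
                        lambda p, q: (not p) or (not q), 2),
    "Distributivity": (lambda p, q, r: p or (q and r),
                       lambda p, q, r: (p or q) and (p or r), 3),
    "Absorption": (lambda p, q: p or (p and q),
                   lambda p, q: p, 2),
    "Implication": (lambda p, q: implies(p, q),
                    lambda p, q: (not p) or q, 2),
    "Contrapositive": (lambda p, q: implies(p, q),
                       lambda p, q: implies(not q, not p), 2),
}

for name, (lhs, rhs, arity) in LAWS.items():
    print(f"{name}: {is_law(lhs, rhs, arity)}")
```

Every line prints True. A non-law such as "p => q is equivalent to q => p" is rejected by the same check.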

More  importantly,  we  can  use  the  above  equivalences  to  derive  ever  more 
equivalences,  bypassing  the  need  to  construct  truth  tables  to  justify  them. 


Example 1.26

We can derive the equivalence p ∨ (¬p ∧ q) <=> p ∨ q using the following
sequence of steps:

      p ∨ (¬p ∧ q)
<=>   (p ∨ ¬p) ∧ (p ∨ q)     (Distributivity)
<=>   true ∧ (p ∨ q)         (Excluded Middle)
<=>   (p ∨ q) ∧ true         (Commutativity)
<=>   p ∨ q                  (Tautology)

We can equally use this technique to verify that a proposition p is a
tautology by demonstrating that p <=> true.

Example 1.27

We can demonstrate that (p => q) ∨ (q => r) is a tautology as follows:

      (p => q) ∨ (q => r)
<=>   (¬p ∨ q) ∨ (¬q ∨ r)         (Implication, twice)
<=>   ¬p ∨ ((q ∨ ¬q) ∨ r)         (Associativity, twice)
<=>   ¬p ∨ (true ∨ r)             (Excluded Middle)
<=>   (¬p ∨ r) ∨ true             (Commutativity, Associativity)
<=>   true                        (Tautology)

As in algebra, we will usually not mention applications of associativity and
commutativity, and write formulae like p ∨ q ∨ r instead of p ∨ (q ∨ r) or
(p ∨ q) ∨ r. This allows us to represent the above calculation in a more
compact way as follows:

      (p => q) ∨ (q => r)
<=>   ¬p ∨ q ∨ ¬q ∨ r        (Implication, twice)
<=>   ¬p ∨ true ∨ r          (Excluded Middle)
<=>   true                   (Tautology)
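
A derivation of this kind can be double-checked by confirming that each line is equivalent to the next. The following Python sketch encodes the longer form of the calculation for (p => q) ∨ (q => r); the list `STEPS` and the helper `chain_holds` are our own names, and the check complements the algebraic argument rather than replacing it.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# One function per line of the derivation, in order.
STEPS = [
    lambda p, q, r: implies(p, q) or implies(q, r),
    lambda p, q, r: ((not p) or q) or ((not q) or r),  # Implication, twice
    lambda p, q, r: (not p) or ((q or (not q)) or r),  # Associativity, twice
    lambda p, q, r: (not p) or (True or r),            # Excluded Middle
    lambda p, q, r: ((not p) or r) or True,            # Commutativity, Associativity
    lambda p, q, r: True,                              # Tautology
]

def chain_holds(steps, arity):
    """True if every consecutive pair of steps agrees on all assignments."""
    assignments = list(product([False, True], repeat=arity))
    return all(a(*vals) == b(*vals)
               for a, b in zip(steps, steps[1:])
               for vals in assignments)

print(chain_holds(STEPS, 3))  # True
```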

Exercise 1.27 (Solution on page 415)

Give derivations of the following equivalences.

1. p ∧ (¬p ∨ q)   <=>  p ∧ q.

2. ¬(p => q)      <=>  p ∧ ¬q.

3. p => (q ∨ r)   <=>  (p => q) ∨ (p => r).

4. p => (q ∧ r)   <=>  (p => q) ∧ (p => r).

5. (p ∧ q) => r   <=>  (p => r) ∨ (q => r).

6. (p ∨ q) => r   <=>  (p => r) ∧ (q => r).

1. Which of the following are statements?

(a)  “17  is  an  odd  integer.” 

(b)  “Manchester  is  the  capital  of  Great  Britain.” 

(c)  “Unload  the  dishwasher  if  it  has  completed  its  washing  cycle.” 

(d)  “Are  all  roses  red?” 

(e) “All roses are red.”

2. Negate each of the items from above that you determine to be
statements.

3.  Which  of  the  following  are  valid  deductions? 

(a)  Mammals  are  warm-blooded  animals. 

Whales  are  mammals. 

Therefore  whales  are  warm-blooded  animals. 

(b)  Mammals  are  warm-blooded  animals. 

Fish  are  not  mammals. 

Therefore  fish  are  not  warm-blooded  animals. 

(c)  Some  doctors  are  surgeons. 

Some  women  are  doctors. 

Therefore  some  women  are  surgeons. 

(d)  All  horses  are  animals. 

Therefore  all  horses’  heads  are  animal  heads. 

(e)  Some  girls  are  better  than  others. 

Therefore  some  girls’  mothers  are  better  than  other  girls’  mothers. 

4.  Formalise  the  following  statement  of  Sherlock  Holmes  in  propositional 
logic: 

“If I’m not mistaken Watson, that was the Dore and Totley
tunnel through which we have just come, and if so we shall
be in Sheffield in a few minutes.”

5.  Let  E  and  T  and  W  represent  the  following  propositions. 

E:  Your  laptop’s  warranty  has  expired. 

T:  You  have  tampered  with  the  electronics  in  your  laptop. 

W :  Your  laptop  is  covered  by  its  warranty. 

(a)  Translate  the  following  statements  into  propositional  logic. 

W1: Your laptop is covered by its warranty as long as the warranty
has not expired and you have not tampered with the laptop’s
electronic components.

W2:  Your  laptop  is  not  covered  by  its  warranty  if  the  warranty  has 
expired  or  if  you  have  tampered  with  the  laptop’s  electronic 
components. 

(b)  How  do  these  two  statements  differ?  Which  one  would  you  prefer 
to  see  on  the  warranty  of  your  new  laptop? 

6. Given that P and R are true while Q is false, determine the truth
values of the following formulae. Verify these by building truth tables
for the given formulae.

(a) P ∧ (Q ∨ R)

(b) (P ∧ Q) ∨ R

(c) ¬(P ∧ R) ∧ R

(d) ¬P ∨ ¬(¬Q ∧ R)

7. Write each of the following statements symbolically in the form P => Q
(using the suggested propositional variables), and then express them
in English in the form “If ... then ...”.

(a)  I  will  play  golf  tomorrow  (G)  unless  it  rains  (R). 

(b)  I’ll  do  it  ( D )  if  you  ask  me  nicely  (JV). 

(c)  Ann  cries  (C)  every  time  she  watches  The  Titanic  (W). 

(d)  I  never  leave  the  house  (A)  without  locking  the  door  ( D ). 

(e)  A  rectangle  is  a  square  (S)  only  if  all  four  of  its  sides  are  the  same 
length  (A). 

(f) A rectangle is a square (S) if all four of its sides are the same
length  (A). 

8.  Letting  CatAway  stand  for  “The  cat’s  away”  and  MicePlay  stand  for 
“The  mice  will  play,”  translate  each  of  the  following  into  propositional 
logic. 

(a)  “The  mice  will  play  whenever  the  cat’s  away.” 

(b)  “The  mice  will  play  only  if  the  cat’s  away.” 

(c)  “The  mice  will  play  unless  the  cat’s  not  away.” 


9.  Suppose  I  lay  the  following  four  cards  on  the  table,  each  of  which  has 
a  shape  on  one  side  (either  a  circle  or  a  square)  and  a  pattern  on  the 
other  side  (either  stripes  or  dots). 


I  claim  that: 

“Every  card  with  a  circle  on  one  side 
always  has  stripes  on  the  other  side.” 

Which  card(s)  do  you  need  to  turn  over  in  order  to  be  certain  that  I 
am  telling  the  truth? 

This  exercise  is  known  as  a  Wason  Selection  Test  after  the  psychol¬ 
ogist  Peter  Wason  who  first  described  it  in  1966.  Be  careful  with  your 
answer:  studies  rarely  result  in  a  reported  success  rate  of  over  20%! 
10. Explain the difference between the following three offers:

(a)  You  can  watch  TV  if  you  tidy  your  room. 

(b)  You  can  watch  TV  only  if  you  tidy  your  room. 

(c)  You  can  watch  TV  if,  and  only  if,  you  tidy  your  room. 

Which  offer  should  a  logical  parent  make  to  their  children? 

11. Give the truth tables defining the NAND, NOR and conditional connectives
p | q, p ↓ q and q ◁ p ▷ r defined in Exercise 1.10, and show
that these are the same as the truth tables for the formulae you gave
in Exercise 1.10 for these connectives.

12. Propositional Logic is based on the three connectives ¬, ∨ and ∧;
the Implication Law and the Equivalence Law show that the two
connectives ⇒ and ⇔ can be defined in terms of the other three.

(a) Show how to express ¬p, p ∨ q and p ∧ q using only the NAND
connective |.

(b) Show how to express ¬p, p ∨ q and p ∧ q using only the NOR
connective ↓.

13.  A  friend  proposes  the  following  game  to  you.  You  keep  tossing  a  coin 
over  and  over  until  one  of  the  following  two  things  happens: 

•  if  two  heads  occur  in  a  row,  then  the  game  ends;  you  win,  and 
your friend will give you £2;

•  if  a  tail  occurs  followed  immediately  by  a  head,  then  the  game 
ends;  your  friend  wins,  and  you  must  give  your  friend  £1. 

Is  it  worth  playing  this  game? 

14.  In  a  certain  country,  every  inhabitant  is  either  a  truth  teller  who  always 
tells  the  truth,  or  a  liar  who  always  lies.  While  travelling  in  this 
country,  you  meet  two  people,  Abe  and  Ben.  Abe  says,  “Ben  and  I  are 
both  liars.”  Is  Abe  a  truth  teller  or  a  liar?  What  about  Ben? 

15.  Argue  that  Superman  doesn’t  exist.  To  do  this,  start  by  making  the 
following  four  assumptions: 

X1:  If  Superman  were  able  and  willing  to  prevent  evil,  he  would  do 
so. 

X2: Superman does not prevent evil.

X3: If Superman were unable to prevent evil, he would be impotent;
and if he were unwilling to prevent evil, he would be malevolent.

X4: If Superman exists, he is neither impotent nor malevolent.

Argue  as  follows.  First  introduce  the  following  variables: 
A: “Superman is able to prevent evil.”

W:  “Superman  is  willing  to  prevent  evil.” 

I:  “Superman  is  impotent.” 

M:  “Superman  is  malevolent.” 

P:  “Superman  prevents  evil.” 

E:  “Superman  exists.” 

(a)  The  first  assumption  translates  into  the  following  formal  logical 
statement: 

X1: (A ∧ W) ⇒ P.

Translate  the  remaining  assumptions  X2,  X3  and  X4  into  formal 
logical  statements. 

(b) Use assumptions X1 and X2 to argue that ¬A ∨ ¬W.

(c) Use assumption X3, and the fact from (b), to argue that I ∨ M.

(d) Use assumption X4, and the fact from (c), to draw your conclusion.

16.  Which  of  the  following  statements  is  true? 

(a)  All  of  the  below. 

(b)  None  of  the  below. 

(c)  All  of  the  above. 

(d)  One  of  the  above. 

(e)  None  of  the  above. 

(f)  None  of  the  above. 

17.  The  following  famous  puzzle  is  referred  to  as  the  Einstein  Riddle  as 
Albert  Einstein  is  sometimes  credited  with  inventing  it  as  a  boy.  He 
is  also  credited  with  claiming  that  only  two  percent  of  the  world’s 
population  can  solve  it. 

You  are  given  the  following  information  about  five  houses  sitting  in  a 
row  on  some  street  which  are  each  painted  a  different  colour,  and  whose 
inhabitants are of different nationalities, own different pets, drink different
beverages, and smoke different brands of American cigarettes.
In  statement  (e),  right  refers  to  the  reader’s  right. 

(a)  The  Englishman  lives  in  the  red  house. 

(b)  The  Spaniard  owns  the  dog. 

(c)  Coffee  is  drunk  in  the  green  house. 

(d)  The  Ukrainian  drinks  tea. 

(e)  The  green  house  is  immediately  to  the  right  of  the  ivory  house. 

(f)  The  Old  Gold  smoker  owns  snails. 
(g) Kools are smoked in the yellow house.

(h)  Milk  is  drunk  in  the  middle  house. 

(i)  The  Norwegian  lives  in  the  first  house. 

(j)  Chesterfields  are  smoked  next  door  to  the  man  with  the  fox. 

(k)  Kools  are  smoked  next  door  to  the  house  where  the  horse  is  kept. 

(l)  The  Lucky  Strike  smoker  drinks  orange  juice. 

(m)  The  Japanese  smokes  Parliaments. 

(n)  The  Norwegian  lives  next  to  the  blue  house. 

The  question  is:  Who  drinks  water?  Who  owns  the  zebra? 

18.  Verify  the  Laws  of  Equivalence  from  Section  1.7,  either  directly  by 
using  truth  tables,  or  by  deriving  them  from  previous  laws  which  have 
already  been  verified. 

19.  Verify  the  following  laws  for  implication  and  equivalence. 


(a) p ⇒ p.

(b) (p ⇒ q) ∧ (q ⇒ r) …

(c) (p ⇔ q) ⇒ (p ∨ r ⇔ q ∨ r).

(d) (p ⇒ q) ⇒ (p ∧ r ⇒ q ∧ r).

(e)

(f)
Chapter 2
Sets 


I  refuse  to  join  any  club  that  would  have  me  as  a  member. 

-  Groucho  Marx. 

Propositional  logic  allows  us  to  reason  about  the  world  by  inferring  new 
facts  from  facts  that  we  already  know.  However,  we  also  need  to  structure 
our  knowledge  by  grouping  things  together  and  by  relating  such  collections 
of  things  with  each  other.  In  the  parlance  of  Computer  Science,  we  don’t 
only  need  algorithms  that  process  information,  but  also  data  structures  that 
collect  and  store  it. 

There  are  many  words  in  English  for  describing  a  collection  of  things 
(especially  animals)  such  as:  a  pack  (of  wolves),  a  school  (of  fish),  a  gaggle 
(of  geese),  a  host  (of  angels),  a  den  (of  thieves),  a  crowd  (of  onlookers),  or  a 
fleet  (of  cars).  The  idea  of  regarding  a  collection  of  things  as  a  single  entity 
is  fundamental  in  mathematics  as  well  as  in  everyday  parlance.  However, 
mathematics  usually  restricts  itself  to  using  a  single  collective  noun:  set. 


2.1 Set Notation

A  set  is  a  collection  of  objects  which  typically  share  a  property.  The  objects 
belonging  to  the  collection  are  individually  referred  to  as  its  elements,  or 
members. The number of objects in a set A is referred to as its cardinality,
and is written |A|. If there are not too many elements in the set, then it is
most typically described by writing its elements in a comma-separated list
between curly braces, as in the following five examples of sets:

•  {  false,  true  }; 

•  {3,  7,  14}; 

•  {red,  blue,  yellow}; 

• {Joel, Felix, Oskar, Amanda};

• {Aberystwyth, Bangor, Cardiff, Lampeter, Newport, Swansea}.
The above sets all contain a small number of elements - their cardinalities are
2, 3, 3, 4 and 6, respectively - and as such are easily written out. Larger sets
which aren’t so easily written out explicitly are often informally described
using an ellipsis “...”, as in the following three examples:

• { 1, 3, 5, ..., 99 } (the set of 50 odd positive integers below 100);

• { a, b, c, ..., z } (the set of 26 letters of the alphabet);

•  {  2,  3,  5,  7,  11,  13,  17,  ...  }  (the  infinite  set  of  prime  numbers). 

Though  we  shall  freely  use  this  notation,  it  is  generally  inadequate.  For 
example,  how  confident  are  you  that  the  final  set  above  denotes  the  set 
of  prime  numbers?  Having  an  infinite  number  of  elements,  it  would  be 
impossible  to  list  them  all  inside  curly  braces,  so  we  would  have  to  stop 
somewhere.  But  perhaps  the  next  element  we  have  in  mind  in  the  sequence 
after  17  is  21.  Perhaps  it  isn’t  even  a  number;  perhaps  the  next  element  in 
the  sequence  is  Groucho  Marx! 

To avoid any ambiguity, sets are typically described not by explicitly listing
the elements between curly braces, but rather by describing the property
that the elements share. In general, we shall describe sets using the following
set-builder notation:

{x  :  x  has  property  P}. 

That is, this set consists of exactly those objects x which satisfy the property P.
We may, of course, use a more appropriate variable than x.
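As an informal illustration, set-builder notation corresponds closely to a set comprehension in a programming language. The sketch below is in Python; the finite universe 1 to 30 and the helper name is_prime are assumptions made purely for this example, since a program cannot enumerate an infinite set.

```python
# Assumed finite universe for illustration: the integers from 1 to 30.
universe = range(1, 31)


def is_prime(n: int) -> bool:
    """Return True if n is a prime number (simple trial division)."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


# { x : x has property P }, restricted to the finite universe above.
primes = {x for x in universe if is_prime(x)}

print(sorted(primes))
```

The property, rather than a partial listing, determines membership, so there is no doubt about whether 21 belongs to the set.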


Example 2.1

The  following  are  all  examples  of  sets: 

1.  The  collection  of  all  beaches  on  the  Gower  Peninsula: 

{  b  :  b  is  a  beach  on  the  Gower  Peninsula  }. 

2.  The  collection  of  all  people  who  have  climbed  Mount  Kailash: 

{p  :  p  has  climbed  Mount  Kailash  }. 

3.  The  collection  of  all  prime  numbers: 

{  n  :  n  is  a  prime  number  }. 

4.  The  collection  of  all  sets  of  people  who  have  a  common  grandmother: 

{  A  :  A  is  a  set  of  people  who  share  a  common  grandmother  }. 
The first set is finite, and its members can be explicitly listed by referring
to  a  map  of  the  Gower  Peninsula.  The  second  set  -  as  far  as  we  know  -  has 
no  members.  The  third  set  has  infinitely  many  members,  and  so  could  not 
be  explicitly  listed.  The  members  of  the  fourth  set  are  themselves  sets. 


You  will  likely  be  familiar  with  many  standard  mathematical  sets  such 
as  the  following. 


∅ = {}                                    (the empty set)
𝔹 = {0, 1}                                (the binary digits, or bits)
ℕ = {0, 1, 2, 3, ...}                     (the natural numbers)
ℤ = {..., -3, -2, -1, 0, 1, 2, 3, ...}    (the integers)
ℚ = { m/n : m, n ∈ ℤ, n ≠ 0 }             (the rational numbers)
ℝ = { x : x is a real number }            (the real numbers)

Note that ∅ and {∅} are different sets; the set ∅ contains no elements,
while the set {∅} contains one element, namely the set ∅ itself, and hence
is not the same as the empty set ∅.

Also note that each set in the above list is bigger than the one above it,
in the sense that it includes all of the elements of the set above it plus other
elements not in the set above.


Exercise 2.1 (Solution on page 416)


Write  out  the  following  sets  explicitly,  by  listing  their  elements  within  curly 
braces. 

1. { x : x is an odd integer with 0 < x < 8 }.

2. { x : x is a day of the week not containing the letter n }.

3. { x : x was a wife of Henry VIII }.

4. { x : x starred as James Bond in the official series of films }.


2.2 Membership, Equality and Inclusion

A set is defined solely by its members, so clearly the most basic question
we can pose is to ask if an object x is a member of a set A. Membership
is denoted by ∈, pronounced “is an element (or a member) of”, as for
example in

7 ∈ {3, 7, 14} (“7 is an element of the set {3, 7, 14}”),

or

Felix ∈ {Joel, Felix, Oskar},

whilst non-membership is denoted by ∉, as for example in

8 ∉ {3, 7, 14} (“8 is not an element of the set {3, 7, 14}”),

or

Amanda ∉ {Joel, Felix, Oskar}.

That is, x ∉ A is the same as ¬(x ∈ A).

Exercise 2.2 (Solution on page 416)

Write out the following sets explicitly, by listing their elements within curly
braces.

1. { x : x is an integer with x = 2y where y ∈ { 1, 2, 3, 4, 5 } }.

2. { x : x is an integer with 2x = y where y ∈ { 1, 2, 3, 4, 5 } }.


Exercise 2.3 (Solution on page 416)

Which of the following propositions are true?

1. 2 ∈ {1, 2, 3}.

2. {2} ∈ {1, 2, 3}.

3. {2} ∈ {{1}, {2}, {3}}.

4. ∅ ∈ {}.

5. ∅ ∈ {∅}.


Since  a  set  is  defined  solely  by  its  members,  two  sets  are  equal  if,  and 
only  if,  they  have  the  same  elements.  So  when  you  list  the  elements  of  a 
set,  the  order  in  which  you  list  them,  and  the  number  of  times  you  list  each 
element,  doesn’t  matter.  Thus,  for  example, 

{3,  7,  14}  =  {7,  14,  3,  7,  3} 

while 


{Joel, Felix, Oskar} ≠ {Joel, Felix, Oskar, Amanda}.

If  you  want  to  show  that  two  sets  are  different,  it  suffices  to  find  a  witness 
to  this  fact;  that  is,  an  element  of  one  set  which  is  not  in  the  other. 
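The same ideas can be tried out with Python's built-in set type, which also ignores order and repetition. The following is only a sketch: the function name witness is an assumption introduced for this illustration, and it returns some element belonging to exactly one of the two sets, or None when the sets are equal.

```python
def witness(a: set, b: set):
    """Return an element in exactly one of a and b, or None if a == b."""
    # The symmetric difference holds the elements in exactly one of the sets.
    difference = a ^ b
    return next(iter(difference), None)


# Order and repetition do not matter when listing elements.
print({3, 7, 14} == {7, 14, 3, 7, 3})   # True

three = {"Joel", "Felix", "Oskar"}
four = {"Joel", "Felix", "Oskar", "Amanda"}

print("Felix" in three)          # True: membership
print("Amanda" not in three)     # True: non-membership
print(witness(three, four))      # Amanda
```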


Exercise 2.4 (Solution on page 416)

Which of the following sets are equal?

• A = { 1, {1, 2} }

• B = { 1, {2} }

• C = { 1, {1} }

• D = { {1, 1}, 1 }

• E = { {2, 1}, 1 }


One set A is a subset of another set B if, and only if, each element of
A is an element of B; in such a case we write A ⊆ B. We also say that
A is included, or contained, in B; or that B is a superset of A, written
B ⊇ A; or that B includes, or contains, A. Reflecting on the description
of equality of sets above, two sets A and B are thus equal, A = B, if, and
only if, each is included in the other:

A = B ⇔ A ⊆ B ∧ B ⊆ A;

that is, if every element of each one is an element of the other.

As further notation, we write A ⊈ B to denote that A is not a subset
of B, that is, if there is an element of A which is not an element of B. In
other words, A ⊈ B is the same as ¬(A ⊆ B). Finally, we write A ⊂ B if A
is a proper subset of B, that is, if A ⊆ B but A ≠ B.

Example 2.4

As already noted above, the binary digits form a proper subset of the natural
numbers; the natural numbers form a proper subset of the integers; the
integers form a proper subset of the rational numbers; and the rational
numbers form a proper subset of the real numbers:

∅ ⊂ 𝔹 ⊂ ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ.


A useful graphical way of depicting sets, and in particular the relationship
between them, is by so-called Venn diagrams. Such a diagram is
obtained by laying out the elements of a set on a piece of paper and then
encircling them. For example, we can depict the sets

X  =  {1,  2,  3,4,5} 

Y  =  {2,  3,4} 

Z  =  {3,4,  5,6} 

by the Venn diagram in Figure 2.1. The rectangle represents some understood
universal set U, referred to as the universe of discourse, consisting
of all elements under consideration, which in this example we take to be the
integers from 1 to 10:

U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},

and the sets X, Y and Z are represented by encircling the relevant elements.
The diagram clearly shows that Y ⊆ X, and indeed Y ⊂ X, since 1 ∈ X but
1 ∉ Y; whereas Z is incomparable to both X and Y: X ⊈ Z and Z ⊈ X; and
Y ⊈ Z and Z ⊈ Y.

Furthermore, it is clear that for any set A: ∅ ⊆ A and A ⊆ U; and that
for any sets A, B and C: if A ⊆ B and B ⊆ C then A ⊆ C.
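The inclusions read off from the Venn diagram can also be checked mechanically. The sketch below uses Python's set comparison operators, where <= tests for a subset and < for a proper subset; it assumes nothing beyond the sets X, Y, Z and U given above.

```python
X = {1, 2, 3, 4, 5}
Y = {2, 3, 4}
Z = {3, 4, 5, 6}
U = set(range(1, 11))   # the universe of discourse {1, ..., 10}

print(Y <= X)            # True: Y is a subset of X
print(Y < X)             # True: Y is a proper subset of X
print(X <= Z, Z <= X)    # False False: X and Z are incomparable
print(Y <= Z, Z <= Y)    # False False: Y and Z are incomparable

# The empty set is a subset of every set, and every set is a subset of U.
print(all(set() <= s <= U for s in (X, Y, Z)))   # True
```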


Exercise 2.5 (Solution on page 416)

Which of the following propositions are true?

1. {2} ⊆ {1, 2, 3}.

2. {1, 2, 3} ⊆ {{1}, {2}, {3}}.

3. {{1, 2}} ⊆ {{1, 2, 3}}.


As  a  final  observation,  we  can  note  the  following  special  properties  of  the 
subset  relation,  all  of  which  are  obvious  using  Venn  diagrams. 

1. It is reflexive, meaning that A ⊆ A holds for every set A.

2. It is antisymmetric, meaning that if A ⊆ B and B ⊆ A then A = B.

3. It is transitive, meaning that if A ⊆ B and B ⊆ C then A ⊆ C.

Moreover, the empty set is the least set with respect to inclusion; that is, it
is contained in any other set: ∅ ⊆ A holds for each set A.
2.3 Sets and Properties

We  have  already  seen  that  listing  elements  is  not  appropriate  for  defining 
sets  with  infinitely  many  elements.  Instead  of  writing 

Primes  =  {2,3,  5,  7,  11,  13,  17,  19,  ...  } 

for  the  set  of  all  prime  numbers,  we  use  the  set-builder  notation 

Primes = { x : x is a prime number }

to define Primes as the set of all objects x such that x is a prime number.
More  generally,  we  use  the  notation 

{x  :  x  has  the  property  P } 

to  indicate  that  we  are  building  (defining)  the  set  of  all  objects  x  which 
satisfy  the  property  P. 

This  set-builder  notation  is  typically  used  to  define  a  subset  B  of  an 
existing  set  A,  in  which  case  we  write: 

B = { x ∈ A : x has the property P }

instead of

B = { x : x ∈ A and x has the property P }.

The  set-builder  notation  used  in  this  way  separates  the  objects  in  set  A 
which  satisfy  a  given  property  from  those  that  do  not. 


Philosophers have classified humans as rational animals (albeit a reasonable
rationality criterion might be to disagree with this classification). Accordingly,
the property of being rational separates humans from all other animals;
it holds of all humans, and of no other animals. Letting Animals denote the
set of all animals and Humans the set of all humans, we can write

Humans = { x ∈ Animals : x is rational }.

Thus, x ∈ Humans if, and only if, x ∈ Animals and x is rational.


Given two real numbers a, b ∈ ℝ the following four intervals frequently occur
in mathematics:

[a, b] = { x ∈ ℝ : a ≤ x ≤ b };

(a, b] = { x ∈ ℝ : a < x ≤ b };

[a, b) = { x ∈ ℝ : a ≤ x < b };

(a, b) = { x ∈ ℝ : a < x < b }.

Given two integers m, n ∈ ℤ the interval between them is defined as

[m..n] = { k ∈ ℤ : m ≤ k ≤ n }.

In  all  of  the  above  intervals,  if  the  first  (left-hand)  value  is  greater  than  the 
second  (right-hand)  value,  then  the  interval  defined  is  the  empty  set  0. 
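As a rough illustration, the integer interval can be built explicitly, while the real intervals (being infinite) are better represented by membership tests. The function names int_interval, in_closed and in_half_open below are assumptions made for this sketch only.

```python
def int_interval(m: int, n: int) -> set:
    """Return [m..n] = { k in Z : m <= k <= n } as a Python set."""
    # range(m, n + 1) is empty whenever m > n, giving the empty set.
    return set(range(m, n + 1))


def in_closed(a: float, b: float, x: float) -> bool:
    """Membership test for [a, b] = { x in R : a <= x <= b }."""
    return a <= x <= b


def in_half_open(a: float, b: float, x: float) -> bool:
    """Membership test for [a, b) = { x in R : a <= x < b }."""
    return a <= x < b


print(sorted(int_interval(2, 5)))   # [2, 3, 4, 5]
print(int_interval(5, 2))           # set(), the empty set
print(in_closed(0.0, 1.0, 1.0), in_half_open(0.0, 1.0, 1.0))   # True False
```

The second call shows the remark above in action: when the left-hand value exceeds the right-hand value, no integer satisfies the defining property.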


Example 2.7

Obviously, x is in the set

{Joel, Felix, Oskar, Amanda}

if, and only if,

x = Joel or x = Felix or x = Oskar or x = Amanda,

which is a property of x. Therefore, the above set can be rewritten, somewhat
tediously, using set-builder notation as:

{x  :  x  =  Joel  or  x  =  Felix  or  x  =  Oskar  or  x  =  Amanda}. 


2.3.1  Russell’s  Paradox 

The  set-builder  notation  is  very  powerful;  however,  it  must  be  used  with 
some  care. 

We have seen that sets can contain any type of object, including sets
themselves. Normally a set will not be a member of itself, but there is nothing
to preclude us considering abnormal sets that are elements of themselves.
Consider, then, the set of normal sets: those sets that are not elements of
themselves; this set, which we call R, can be defined using the set-builder
notation as follows:

R = { A : A ∉ A }.

We can then ask: is R itself a normal set? That is, do we have R ∈ R? Or
do we have R ∉ R? Certainly one of these two must be true: either R is a
normal set, or it isn’t.

• Suppose that R ∈ R. Then R must satisfy the property required of
being an element of R, namely we must have that R ∉ R.

• Suppose that R ∉ R. Then R must fail to satisfy the property required
of being an element of R, namely we must not have that R ∉ R; that
is, we must have that R ∈ R.

By the Law of the Excluded Middle, one of the above two cases must hold.
This means that we must have both R ∈ R and R ∉ R; that is, R is both a
normal set and an abnormal set. This is a contradiction, and as such cannot
be true.

This  anomaly  is  known  as  Russell’s  Paradox,  after  the  philosopher 
Bertrand  Russell  who  devised  it  to  demonstrate  the  need  to  be  vigilant 
in  how  you  define  sets.  In  particular,  it  should  not  be  possible  to  speak 
of  the  set  of  all  sets,  as  such  circularity  leads  directly  to  contradictions. 
Fortunately, this anomaly cannot arise as long as we confine the use of the
set-builder notation to the restricted form

{ x ∈ A : x has the property P }

in which we define the set as a subset of another given set which has been
previously defined. We also need not worry about using the general set-builder
notation if we have an implicit underlying universe of discourse.

Exercise 2.7 (Solution on page 416)

Let A be any set, and define the set R by

R = { X ∈ A : X ∉ X }.

Do we now have R ∈ R? Or do we have R ∉ R? Why is Russell’s Paradox
not a problem here?
2.4 Operations on Sets

In the previous sections we have seen that a set can be constructed directly by
putting curly braces around a listing of its elements, or indirectly using the
set-builder notation. In this section we will consider a variety of operations
which can be used to construct new sets from old.


2.4.1  Union 

The union A ∪ B of two sets A and B consists of exactly those elements of
the universe of discourse which are in either A or B (or both):

A ∪ B = { x : x ∈ A or x ∈ B }.

Thus,

x ∈ A ∪ B ⇔ x ∈ A ∨ x ∈ B.

This is depicted by the following Venn diagram, where the gray area represents
A ∪ B.


The union of the set of people who can speak English and the set of people
living in France is the set of people who can either speak English or who
live in France (or both).


2.4.2  Intersection 

The intersection A ∩ B of two sets A and B consists of exactly those
elements of the universe of discourse which are in both A and B:

A ∩ B = { x : x ∈ A and x ∈ B }.

Thus,

x ∈ A ∩ B ⇔ x ∈ A ∧ x ∈ B.

This is depicted by the following Venn diagram, where the gray area represents
A ∩ B.


{1, 2, 3, 4, 5} ∩ {2, 4, 6, 8, 10} = {2, 4}




Example 2.11

The intersection of the set of people who can speak English and the set of
people living in France is the set of people living in France who can speak
English.


Two sets A and B are said to be disjoint if they have no elements in
common; that is to say, if their intersection is empty: A ∩ B = ∅. In terms
of Venn diagrams, this means that the regions depicting A and B do not
overlap.

There will typically be fewer elements in the union of two finite sets A
and B, |A ∪ B|, than |A| + |B|; the whole will generally be less than the
sum of the parts. This is due to the fact that |A| + |B| counts the members
of the intersection A ∩ B twice. To balance this, we have the following
principle.

Theorem 2.11 Inclusion-Exclusion Principle

For finite sets A and B: |A ∪ B| = |A| + |B| − |A ∩ B|.
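The principle is easy to check on concrete finite sets. The following sketch uses the two sets {1, 2, 3, 4, 5} and {2, 4, 6, 8, 10}; the function name inclusion_exclusion_holds is an assumption introduced for this illustration.

```python
def inclusion_exclusion_holds(a: set, b: set) -> bool:
    """Check |A ∪ B| = |A| + |B| - |A ∩ B| for two finite sets."""
    return len(a | b) == len(a) + len(b) - len(a & b)


A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8, 10}

# |A| + |B| = 10 counts the two common elements 2 and 4 twice.
print(len(A), len(B), len(A & B), len(A | B))   # 5 5 2 8
print(inclusion_exclusion_holds(A, B))          # True
```

For disjoint sets the intersection is empty, so the correction term vanishes and the sizes simply add.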


2.4.3  Difference 

The difference A \ B of two sets A and B consists of exactly those elements
of the universe of discourse which are in A but not in B:

A \ B = { x ∈ A : x ∉ B }.

Thus,

x ∈ A \ B ⇔ x ∈ A ∧ x ∉ B.

This is depicted by the following Venn diagram, where the gray area represents
A \ B.


Example 2.12

{1, 2, 3, 4, 5} \ {2, 4, 6, 8, 10} = {1, 3, 5},

and

{2, 4, 6, 8, 10} \ {1, 2, 3, 4, 5} = {6, 8, 10}.


Example 2.13

The difference of the set of people who can speak English and the set of
people living in France is the set of English-speaking people who do not live
in France.

Conversely,  the  difference  of  the  set  of  people  living  in  France  and  the  set 
of  people  who  can  speak  English  is  the  set  of  non-English-speaking  people 
living  in  France. 


2.4.4  Complement 

The complement Ā of a set A is the set consisting of exactly those elements
of the universe of discourse which are not elements of A:

Ā = { x : x ∉ A }.

Thus,

x ∈ Ā ⇔ x ∉ A.

The set Ā is thus the same as U \ A, and is depicted by the following Venn
diagram, where the gray area represents Ā.


Assuming the universe of discourse is U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
the complement of {1, 2, 3, 4, 5} is {6, 7, 8, 9, 10}, and the complement of
{2, 4, 6, 8, 10} is {1, 3, 5, 7, 9}.
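The four operations introduced so far map directly onto operators of Python's set type. The sketch below recomputes the numerical examples of this section; since Python has no built-in notion of a universe of discourse, the complement is written as an assumed helper function complement that takes the universe explicitly.

```python
U = set(range(1, 11))   # universe of discourse {1, ..., 10}
A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8, 10}


def complement(s: set, universe: set) -> set:
    """Return the elements of the universe which are not in s, i.e. U \\ s."""
    return universe - s


print(sorted(A | B))              # union: [1, 2, 3, 4, 5, 6, 8, 10]
print(sorted(A & B))              # intersection: [2, 4]
print(sorted(A - B))              # difference A \ B: [1, 3, 5]
print(sorted(B - A))              # difference B \ A: [6, 8, 10]
print(sorted(complement(A, U)))   # [6, 7, 8, 9, 10]
print(sorted(complement(B, U)))   # [1, 3, 5, 7, 9]
```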


Example 2.15

Assuming the universe of discourse is the set of people in the world, the
complement of the set of people who can speak English is the set of
non-English-speaking people; and the complement of the set of people living in
France is the set of people who do not live in France.
Exercise 2.15 (Solution on page 416)

Consider the following sets:

U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, (the universe of discourse)

A = {1, 3, 5, 7, 9},

B = {3, 4, 5},

C = {5, 6, 7, 8, 9}.

Draw a Venn diagram depicting these sets, and compute the following sets:

1. A ∩ C.

2. (A ∩ B) ∪ C.

3. A ∩ (B ∪ C).

4. (A ∪ B) \ C.

5. (A ∪ B) ∩ C.


Exercise 2.16 (Solution on page 417)

Let A, B and C be sets.

1. If A ⊆ B, what can you say about A ∪ B and A ∩ B?

2. If A ⊆ B, what can you say about Ā and B̄?

3. What is the complement of Ā, that is, the complement of the complement of A?

4. If C ⊆ A and C ⊆ B, how is C related to A ∩ B?

5. If A ⊆ C and B ⊆ C, how is C related to A ∪ B?


2.4.5  Powerset 

The powerset 𝒫(A) of a set A is the set consisting of all subsets of A:

𝒫(A) = { X : X ⊆ A }.

Thus,

X ∈ 𝒫(A) ⇔ X ⊆ A.

In particular, ∅ ∈ 𝒫(A) and A ∈ 𝒫(A).

We might only be interested in finite subsets. In this case we shall denote
by 𝒫fin(A) the set consisting of all finite subsets of A:

𝒫fin(A) = { X : X ⊆ A and X is finite }.
1. The set {0, 1} has four subsets:

𝒫({0, 1}) = { ∅, {0}, {1}, {0, 1} }.

More  specifically,  there  are  the  following  subsets: 

•  one  subset  with  no  elements  (the  empty  set); 

•  two  singleton  subsets  (one  for  each  element  in  the  set);  and 

•  one  subset  with  two  elements  (the  whole  set  itself). 

2.  The  set  {cola,  fanta,  sprite}  has  eight  subsets: 

𝒫({cola, fanta, sprite})

= { ∅,

{cola}, {fanta}, {sprite},

{cola, fanta}, {cola, sprite}, {fanta, sprite},

{cola, fanta, sprite} }.

More  specifically,  there  are  the  following  subsets: 

•  one  subset  with  no  elements  (the  empty  set); 

•  three  singleton  subsets  (one  for  each  element  in  the  set); 

•  three  subsets  with  two  elements  (one  for  each  element  left  out); 
and 

•  one  set  with  three  elements  (the  whole  set  itself). 

3. The set {Joel, Felix, Oskar, Amanda} has 16 subsets:

𝒫({Joel, Felix, Oskar, Amanda})

= { ∅,

{Joel}, {Felix}, {Oskar}, {Amanda},

{Joel, Felix}, {Joel, Oskar}, {Joel, Amanda},

{Felix, Oskar}, {Felix, Amanda}, {Oskar, Amanda},

{Joel, Felix, Oskar}, {Joel, Felix, Amanda},

{Joel, Oskar, Amanda}, {Felix, Oskar, Amanda},

{Joel, Felix, Oskar, Amanda} }.

More  specifically,  there  are  the  following  subsets: 

•  one  subset  with  no  elements  (the  empty  set); 

•  four  singleton  subsets  (one  for  each  element  in  the  set); 

•  six  subsets  with  two  elements  (one  for  each  pair); 
• four subsets with three elements (one for each element left out);
and

• one set with four elements (the whole set itself).

4. In general, if |A| = n then |𝒫(A)| = 2^n: a set with n elements has 2^n
different subsets.
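The counts above can be reproduced with a short program. The following is only a sketch: the function name powerset is an assumption, and each subset is represented as a frozenset because Python's ordinary sets cannot themselves be elements of a set.

```python
from collections import Counter
from itertools import combinations


def powerset(s):
    """Return the set of all subsets of s, each subset as a frozenset."""
    items = list(s)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }


drinks = {"cola", "fanta", "sprite"}
subsets = powerset(drinks)

print(len(subsets))                   # 8, that is 2 ** 3
print(frozenset() in subsets)         # True: the empty set is a subset
print(frozenset(drinks) in subsets)   # True: the whole set is a subset

# Number of subsets of each size for a four-element set.
people = {"Joel", "Felix", "Oskar", "Amanda"}
sizes = Counter(len(subset) for subset in powerset(people))
print(sorted(sizes.items()))   # [(0, 1), (1, 4), (2, 6), (3, 4), (4, 1)]
```

Building the subsets size by size mirrors the breakdown given in the examples: one empty subset, then the singletons, the pairs, and so on up to the whole set.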


Example 2.17

Amanda has invited the following five friends to her birthday party: Daniel,
Ella, Mia, Rhodri and Zoe. However, some of them might not show up. If
we let

Friends = {Daniel, Ella, Mia, Rhodri, Zoe}

then the collection of combinations of friends that might come to Amanda’s
party is given by 𝒫(Friends). For example, perhaps Ella and Rhodri are
busy that day, but the others all come; then the set of friends that come to
Amanda’s party is:

{Daniel, Mia, Zoe} ∈ 𝒫(Friends).


Exercise 2.17 (Solution on page 417)

List the elements of 𝒫(Friends), where Friends is the set defined in
Example 2.17 above. How many sets of each size are there?


Exercise 2.18 (Solution on page 418)

Form the following sets from the empty set ∅:

1. the set A = 𝒫(∅);

2. the set B = 𝒫(A);

3. the set C = 𝒫(B).

How  many  elements  are  in  each  of  these  sets? 


Exercise 2.19 (Solution on page 418)

Given an arbitrary set $A$, what are $\mathcal{P}(A) \cap \emptyset$ and $\mathcal{P}(A) \cap \{\emptyset\}$?

2.4.6 Generalised Union and Intersection

It  makes  perfect  sense  to  take  the  union  or  intersection  of  any  number  of 
sets,  not  just  two.  For  example,  we  can  consider  the  union 

$$A \cup B \cup C$$

of three sets $A$, $B$ and $C$, meaning the set whose elements are those objects which are members of any of the sets $A$, $B$ or $C$; or the intersection

$$A \cap B \cap C \cap D \cap E$$

of five sets $A$, $B$, $C$, $D$ and $E$, meaning the set whose elements are those objects which are members of all of the sets $A$, $B$, $C$, $D$ and $E$. We don’t have to worry about the order in which we take the sets; for example, the set $A \cup (B \cup C)$ is clearly the same as $(C \cup A) \cup B$. This is because the union and intersection operations are associative:

$$A \cup (B \cup C) = (A \cup B) \cup C \quad\text{and}\quad A \cap (B \cap C) = (A \cap B) \cap C;$$

and commutative:

$$A \cup B = B \cup A \quad\text{and}\quad A \cap B = B \cap A.$$

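In fact, we can extend union and intersection to apply to arbitrary families (sets) of sets: if $\mathcal{F}$ is a set of sets, then

$$\bigcup \mathcal{F} = \{\, x : x \in A \text{ for some } A \in \mathcal{F} \,\}$$

$$\bigcap \mathcal{F} = \{\, x : x \in A \text{ for all } A \in \mathcal{F} \,\}$$

In particular, $A \cup B = \bigcup \{A, B\}$ and $A \cap B = \bigcap \{A, B\}$. With a little thought, the following identities become apparent:

1. $A = \bigcup \{A\}$ and $A = \bigcap \{A\}$.

2. $A = \bigcup \mathcal{P}(A)$ and $\emptyset = \bigcap \mathcal{P}(A)$.

3. $\emptyset = \bigcup \emptyset$ and $U = \bigcap \emptyset$, where $U$ is the universe of discourse.

The final two identities are worth further explanation. By definition, $x \in \bigcup \emptyset$ if, and only if, $x \in A$ for some $A \in \emptyset$; but since there can be no such $A \in \emptyset$, there can be no such $x$, and so $\bigcup \emptyset$ is empty.

Similarly, by definition, $x \in \bigcap \emptyset$ if, and only if, $x \in A$ for all $A \in \emptyset$; but since there can be no such $A \in \emptyset$, it is vacuously true that $x \in A$ for each of these $A \in \emptyset$, and so every $x$ in the universe $U$ belongs to $\bigcap \emptyset$.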
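The generalised operations, including the two edge cases for the empty family, can be sketched in code. This is an illustration only, assuming Python's `functools.reduce`; the function names `big_union` and `big_intersection` and the ten-element universe are hypothetical choices made here, not part of the text.

```python
from functools import reduce

def big_union(family):
    """Union of a family (iterable) of sets; the empty family gives the empty set."""
    return reduce(lambda acc, s: acc | s, family, frozenset())

def big_intersection(family, universe):
    """Intersection of a family of sets, starting from the universe of discourse."""
    return reduce(lambda acc, s: acc & s, family, frozenset(universe))

U = frozenset(range(10))   # hypothetical universe of discourse
A = frozenset({1, 2, 3})
B = frozenset({2, 3, 4})

print(big_union([A, B]) == A | B)            # True
print(big_intersection([A, B], U) == A & B)  # True
print(big_union([]) == frozenset())          # True: the union of no sets is empty
print(big_intersection([], U) == U)          # True: the intersection of no sets is U
```

Starting the intersection from the universe is what makes the empty family behave as identity 3 says: with nothing to intersect, the starting value is returned unchanged.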

Example 2.19

Suppose, e.g., that $\mathit{CS101}$ is the set of all students enrolled on the course Computer Science 101, and that $\mathit{ClassLists}$ is the set of all class lists, so that, for example, $\mathit{CS101} \in \mathit{ClassLists}$. Then the set $\mathit{Students}$ of all students, that is, all people who are enrolled on some course, would be

$$\mathit{Students} = \bigcup \mathit{ClassLists}.$$

The set $\bigcap \mathit{ClassLists}$ would likely be empty, as it would contain those students who are enrolled on all courses.

Exercise 2.20 (Solution on page 418)

Given an arbitrary set $A$, what are $\bigcap \mathcal{P}_{\mathrm{fin}}(A)$ and $\bigcup \mathcal{P}_{\mathrm{fin}}(A)$?

An ordered pair is simply a pair of objects $(a, b)$ with first coordinate $a$ and second coordinate $b$. For example, points in the $xy$-plane are denoted by ordered pairs; the ordered pair $(4, 9)$, for example, denotes the point with $x$-coordinate 4 and $y$-coordinate 9. The ordered pair $(a, b)$ is different from the set $\{a, b\}$ in that it is ordered; $(a, b) \neq (b, a)$ (unless, of course, $a = b$), whereas $\{a, b\} = \{b, a\}$. More precisely,

$$(a, b) = (c, d) \text{ if, and only if, } a = c \text{ and } b = d.$$

The Cartesian product $A \times B$ of two sets $A$ and $B$ is the set of all ordered pairs in which the first coordinate $a$ is an element of $A$ and the second coordinate $b$ is an element of $B$.

$$A \times B = \{\, (a, b) : a \in A \text{ and } b \in B \,\}.$$

Thus,

$$(a, b) \in A \times B \;\Leftrightarrow\; a \in A \land b \in B.$$

For example, $\mathbb{R} \times \mathbb{R}$, typically written as $\mathbb{R}^2$, denotes the set of points in the $xy$-plane.


Example 2.20

The Cartesian product $[1..m] \times [1..n]$ of the intervals $[1..m]$ and $[1..n]$ can model a finite grid, such as the points of an LCD screen or the squares on a chess board.

$$
\begin{array}{cccc}
(1,1) & (1,2) & \cdots & (1,n) \\
(2,1) & (2,2) & \cdots & (2,n) \\
\vdots & \vdots & & \vdots \\
(m,1) & (m,2) & \cdots & (m,n)
\end{array}
$$

Example 2.21

Many programming languages offer abstract data types that allow you to store and retrieve data using key-value pairs. These data types have different names in different programming languages, such as associative array, dictionary, map, or table. But key-value pairs are always ordered pairs from a Cartesian product $\mathit{Keys} \times \mathit{Values}$, where $\mathit{Keys}$ is a set of keys and $\mathit{Values}$ is a set of values. The values are the pieces of information that are stored in the data type, and the keys - which are unique for each value - allow you to retrieve the value.

As  an  example,  we  may  have  a  national  database  in  which  each  person  is 
assigned  a  unique  identification  number.  In  this  case,  names  serve  as  keys 
and  the  values  are  the  identification  numbers  associated  with  each  name: 

IDNumbers  =  {  (Joel,  7613), 

(Felix,  8217), 

(Oskar,  6457), 

(Amanda,  9601), 

}. 

As  another  example,  a  correspondence  can  be  made  between  countries  and 
their  capital  cities: 

CapitalCities  =  {  (France,  Paris), 

(Peru,  Lima), 

(Japan,  Tokyo), 

(Mali,  Bamako), 

}. 
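As a small illustration of this view, a Python `dict` can be read as a set of ordered pairs drawn from $\mathit{Keys} \times \mathit{Values}$. The sketch below uses only the four entries listed in the text; the remainder of the database is not shown there and is not guessed at here.

```python
from itertools import product

# Only the four entries listed in the text.
id_numbers = {"Joel": 7613, "Felix": 8217, "Oskar": 6457, "Amanda": 9601}

pairs = set(id_numbers.items())    # the key-value pairs as ordered pairs
keys = set(id_numbers)             # the set Keys
values = set(id_numbers.values())  # the set Values

# Every stored pair is an element of the Cartesian product Keys x Values.
is_subset = pairs <= set(product(keys, values))

print(is_subset)            # True
print(id_numbers["Oskar"])  # 6457: a key retrieves its value
```

The dictionary holds 4 of the 16 pairs in $\mathit{Keys} \times \mathit{Values}$; it is a subset of the product, not the whole product.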


We can form the Cartesian product of any number $n \in \mathbb{N}$ of sets, whose elements are $n$-tuples. For example

$$A \times B \times C = \{\, (a, b, c) : a \in A,\ b \in B \text{ and } c \in C \,\}$$

represents the set of triples $(a, b, c)$ in which the first coordinate $a$ is an element of $A$, the second coordinate $b$ is an element of $B$, and the third coordinate $c$ is an element of $C$. In general, we write $A^n$ to denote $A \times A \times \cdots \times A$, that is, the Cartesian product of $n$ copies of the set $A$. Three-dimensional space, thus, is defined by $\mathbb{R}^3 = \mathbb{R} \times \mathbb{R} \times \mathbb{R}$.

Note that the number of elements in a product is the product of the number of elements in the individual sets. In particular, for any set $A$,

$$A \times \emptyset = \emptyset = \emptyset \times A.$$
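The counting rule for products can be tried out on small finite sets. The following sketch assumes Python's `itertools.product`; the example sets are arbitrary choices made for illustration.

```python
from itertools import product

A = {1, 2, 3}
B = {"a", "b"}

AxB = set(product(A, B))             # the Cartesian product A x B
A_cubed = set(product(A, repeat=3))  # A^3 = A x A x A

print(len(AxB) == len(A) * len(B))   # True: 6 pairs
print(len(A_cubed) == len(A) ** 3)   # True: 27 triples
print(set(product(A, set())))        # set(): A x (empty set) is empty
```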
Example 2.22

Let $S$ represent all students, $C$ represent all courses, and $G$ represent the possible grades. Then $S \times C \times G$ represents all triples $(s, c, g)$ where $s \in S$ is a student, $c \in C$ is a course and $g \in G$ is a grade. A University student database would be represented as a subset of this set, recording the grades for all students registered in each course.


Example 2.23

A pixel is a point on a computer screen, and these are laid out in a rectangular grid $[1..h] \times [1..w]$, as in Example 2.20, with the number of pixels dependent on the size and resolution of the screen.

Each pixel is displayed as a dot of a certain colour. In the RGB model, a colour is specified by a triple

$$(r, g, b) \in [0, 1]^3 \quad\text{where}\quad [0, 1] = \{\, x \in \mathbb{R} : 0 \le x \le 1 \,\}$$

representing an intensity of red, green and blue, respectively, with 0 being no intensity at all and 1 being maximum intensity. For example, black is represented by $(0, 0, 0)$ (no colours) while white is represented by $(1, 1, 1)$ (maximum intensity of all colours); and red, green and blue are represented by $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$, respectively. We can thus define the following two sets:

$$\mathit{Pixel} = [1..h] \times [1..w] \quad\text{and}\quad \mathit{Colour} = [0, 1]^3,$$

and use them to define a point on the screen as a member of the set

$$\mathit{Point} = \mathit{Pixel} \times \mathit{Colour}$$

which assigns a colour to a pixel. Each point is therefore represented by an ordered pair $((x, y), (r, g, b))$ whose first coordinate is the ordered pair $(x, y)$, and whose second coordinate is the ordered triple $(r, g, b)$.
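The nested structure of a point, a pair whose coordinates are themselves a pair and a triple, maps naturally onto nested tuple types. The sketch below is one possible rendering, assuming Python's `typing.NamedTuple`; the class names mirror the sets in the example, while the field names and the helper `is_valid_point` are hypothetical.

```python
from typing import NamedTuple

class Pixel(NamedTuple):
    x: int   # row index in [1..h]
    y: int   # column index in [1..w]

class Colour(NamedTuple):
    r: float  # red intensity in [0, 1]
    g: float  # green intensity in [0, 1]
    b: float  # blue intensity in [0, 1]

class Point(NamedTuple):
    pixel: Pixel     # first coordinate: an ordered pair
    colour: Colour   # second coordinate: an ordered triple

def is_valid_point(p: Point, h: int, w: int) -> bool:
    """Check membership of p in ([1..h] x [1..w]) x [0, 1]^3."""
    in_grid = 1 <= p.pixel.x <= h and 1 <= p.pixel.y <= w
    in_cube = all(0.0 <= c <= 1.0 for c in p.colour)
    return in_grid and in_cube

red_corner = Point(Pixel(1, 1), Colour(1.0, 0.0, 0.0))
print(is_valid_point(red_corner, h=1080, w=1920))  # True
```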


Exercise 2.23 (Solution on page 418)

Every rational number can be represented as an ordered pair of integers. The number 3/4, for example, corresponds to the ordered pair $(3, 4)$. Define the operations of addition and multiplication on ordered pairs of integers such that they correspond to the standard operations on fractions.

Modelling with Sets


As  the  fundamental  data  structures  of  mathematics,  sets  inevitably  occur  in 
the  specifications  of  systems.  In  many  cases,  sets  capture  system  properties 
more  concisely  than  propositional  logic.  In  this  section,  we  explore  a  number 
of  examples,  starting  with  revisiting  Amos  Judd. 


Example 2.24

Consider  the  following  three  assumptions: 

1.  All  candy  has  sugar. 

2.  John  eats  only  healthy  foods. 

3.  No  healthy  food  contains  sugar. 

We can reason about these assumptions by introducing the sets $H$, $S$, $J$ and $C$ to represent, respectively, the set of healthy foods, the set of sugary foods, the set of foods that John eats, and the set of candy. The above assumptions can be expressed, equationally and with a Venn diagram, as follows:

1. $C \subseteq S$

2. $J \subseteq H$

3. $S \cap H = \emptyset$

From this picture it is clear that no candy is healthy ($C \cap H = \emptyset$, since $C \subseteq S$ and $S \cap H = \emptyset$), and as such that John doesn’t eat candy ($J \cap C = \emptyset$, since $J \subseteq H$).
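The same conclusion can be checked by exhaustive search over a small universe. This is a sketch under stated assumptions: the three-element universe of foods is invented for illustration, and a search over one small universe is a sanity check rather than a proof.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

foods = {"apple", "toffee", "soda"}   # hypothetical universe of foods
all_subsets = subsets(foods)          # 8 subsets

# Look for any choice of H, S, J, C that satisfies the three assumptions
# but violates one of the conclusions (C and H disjoint, J and C disjoint).
counterexamples = [
    (H, S, J, C)
    for H, S, J, C in product(all_subsets, repeat=4)
    if C <= S and J <= H and not (S & H)   # assumptions 1, 2 and 3
    and ((C & H) or (J & C))               # a conclusion fails
]

print(counterexamples)  # []
```

No assignment of the four sets satisfies the assumptions while breaking either conclusion, which agrees with the reasoning from the Venn diagram.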


Exercise 2.24 (Solution on page 418)

Recall  the  situation  regarding  Amos  Judd  from  Exercise  1.14  (page  33),  in 
which  the  fact  that  Amos  Judd  loves  cold  mutton  could  be  inferred  from 
the  following  assumptions: 

1.  All  the  policemen  on  this  beat  sup  with  our  cook. 

2.  No  man  with  long  hair  can  fail  to  be  a  poet. 

3.  Amos  Judd  has  never  been  in  prison. 

4.  Our  cook’s  cousins  all  love  cold  mutton. 

5.  None  but  policemen  on  this  beat  are  poets. 

6.  None  but  her  cousins  ever  sup  with  our  cook. 

7. Men with short hair have all been in prison.

Demonstrate how to solve this problem by reasoning about appropriately-defined sets.


Exercise 2.25 (Solution on page 419)

Another  one  of  Lewis  Carroll’s  famous  puzzles  has  the  following  premises: 


All  babies  are  illogical. 

Nobody  is  despised  who  can  manage  a  crocodile. 
Illogical  persons  are  despised. 


Use  an  appropriate  Venn  diagram  to  deduce  from  these  premises  that  no 
baby  can  manage  a  crocodile. 


Exercise 2.26 (Solution on page 420)


Use  an  appropriate  Venn  diagram  to  determine  whether  or  not  the  following 
argument  is  valid. 


All  oceans  are  full  of  water. 

No  ponds  are  oceans. 

Therefore  no  ponds  are  full  of  water. 


At  a  certain  hospital,  40  patients  each  have  at  least  one  of  the  following 
symptoms:  a  fever,  a  sore  throat,  or  an  earache.  18  of  them  have  an  earache 
and  25  of  them  have  a  sore  throat,  while  eight  of  them  have  both  an  earache 
and  a  sore  throat.  Of  the  fever  sufferers,  11  of  them  have  sore  throats,  nine 
have  earaches,  and  two  have  both  a  sore  throat  and  an  earache.  How  many 
fever  sufferers  are  there? 

We can use a Venn diagram to solve this problem, by drawing the three sets of patients as follows:

The question marks represent the numbers of patients in the relevant subsets, and these numbers must add up to 40. We merely need to replace these question marks with the relevant numbers based on the information given in the problem, which we can do by working from the inside out.

•  We  first  put  a  2  in  the  intersection  of  all  three  sets,  depicting  the  two 
patients  who  are  suffering  from  all  three  symptoms. 

• Since eight patients are suffering from both a sore throat and an earache, six of these must not have fever, so we can put a 6 in the relevant place in the diagram.

•  Next,  there  are  11  fever  sufferers  who  have  a  sore  throat;  we  know  that 
two  of  these  also  have  an  earache,  so  nine  of  these  must  not  have  an 
earache,  so  we  can  thus  put  a  9  in  the  relevant  place  in  the  diagram. 

•  Also,  there  are  nine  fever  sufferers  who  have  an  earache;  we  know  that 
two  of  these  also  have  a  sore  throat,  so  seven  of  these  must  not  have  a 
sore  throat,  so  we  can  thus  put  a  7  in  the  relevant  place  in  the  diagram. 

• There are 18 patients with earaches, 15 of whom have other symptoms; thus three have no other symptoms, so we can put a 3 in the relevant place in the diagram.

• There are 25 patients with sore throats, 17 of whom have other symptoms; thus eight have no other symptoms, so we can put an 8 in the relevant place in the diagram.

•  As  there  are  40  patients  in  total,  and  35  are  accounted  for  as  having 
either  a  sore  throat  or  an  earache,  there  are  five  patients  who  only  suffer 
from  fever,  so  we  can  put  a  5  in  the  relevant  place  in  the  diagram. 

The  Venn  diagram  thus  looks  as  follows: 


Note that this is not a Venn diagram in the usual sense: the elements of the universe are not the numbers $\{2, 3, 5, 6, 7, 8, 9\}$. Rather, the 5, for example, represents five elements of the set of patients who are suffering only from fever. The Venn diagram would more rightly look something like the following:

Here, each dot represents a distinct patient. However, the original Venn diagram, with just the numbers, is far easier to read.

With this, a final simple count tells us that $5 + 9 + 7 + 2 = 23$ patients suffer from fever.
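The inside-out procedure amounts to a handful of subtractions, which can be written out step by step. In this sketch the variable names are chosen here for readability; the numbers are those given in the problem.

```python
# Data given in the problem statement.
total = 40
earache = 18
sore_throat = 25
ear_and_throat = 8
fever_and_throat = 11
fever_and_ear = 9
all_three = 2

# Work from the inside of the Venn diagram outwards.
throat_ear_only = ear_and_throat - all_three        # 6: no fever
fever_throat_only = fever_and_throat - all_three    # 9: no earache
fever_ear_only = fever_and_ear - all_three          # 7: no sore throat
ear_only = earache - (throat_ear_only + fever_ear_only + all_three)            # 3
throat_only = sore_throat - (throat_ear_only + fever_throat_only + all_three)  # 8

# Patients with a sore throat or an earache, then those with fever alone.
ear_or_throat = earache + sore_throat - ear_and_throat   # 35
fever_only = total - ear_or_throat                       # 5

fever = fever_only + fever_throat_only + fever_ear_only + all_three
print(fever)  # 23
```

The seven regions sum to 40, which is a useful check that no region was counted twice.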


2.7 Algebraic Laws for Set Identities

We can often represent the same set in a variety of ways; for example, we’ve already noted that it doesn’t matter whether we write $A \cup B$ or $B \cup A$ as these give the same set. In this section we list a variety of identities, which will allow us to reason algebraically about sets. All of the laws presented can be verified informally by considering the appropriate Venn diagrams.

Commutativity Laws

$$A \cup B = B \cup A \qquad A \cap B = B \cap A$$

Associativity Laws

$$A \cup (B \cup C) = (A \cup B) \cup C \qquad A \cap (B \cap C) = (A \cap B) \cap C$$

Idempotence Laws

$$A \cup A = A \qquad A \cap A = A$$

Distributivity Laws

$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C) \qquad A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$$

De Morgan’s Laws

$$\overline{A \cup B} = \overline{A} \cap \overline{B} \qquad \overline{A \cap B} = \overline{A} \cup \overline{B}$$

Double Complement Law

$$\overline{\overline{A}} = A$$

Universe Laws

$$A \cup U = U \qquad A \cap U = A$$

Empty Set Laws

$$A \cup \emptyset = A \qquad A \cap \emptyset = \emptyset$$

Complement Laws

$$A \cup \overline{A} = U \qquad A \cap \overline{A} = \emptyset$$

Absorption Laws

$$A \cup (A \cap B) = A \qquad A \cap (A \cup B) = A$$

You  can  (and  should!)  convince  yourself  of  all  of  the  above  identities  by 
constructing  appropriate  Venn  diagrams. 
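Alongside Venn diagrams, the laws can be spot-checked on randomly chosen subsets of a small universe. The sketch below is a sanity check, not a proof; the eight-element universe, the fixed random seed and the helper names are assumptions made for illustration.

```python
import random

U = frozenset(range(8))   # a small, hypothetical universe of discourse
EMPTY = frozenset()

def comp(s):
    """Complement relative to the universe U."""
    return U - s

def random_subset(rng):
    return frozenset(x for x in U if rng.random() < 0.5)

def laws_hold(A, B, C):
    """Check each law listed above for one choice of A, B and C."""
    return all([
        A | B == B | A, A & B == B & A,                      # commutativity
        A | (B | C) == (A | B) | C, A & (B & C) == (A & B) & C,  # associativity
        A | A == A, A & A == A,                              # idempotence
        A | (B & C) == (A | B) & (A | C),                    # distributivity
        A & (B | C) == (A & B) | (A & C),
        comp(A | B) == comp(A) & comp(B),                    # De Morgan
        comp(A & B) == comp(A) | comp(B),
        comp(comp(A)) == A,                                  # double complement
        A | U == U, A & U == A,                              # universe
        A | EMPTY == A, A & EMPTY == EMPTY,                  # empty set
        A | comp(A) == U, A & comp(A) == EMPTY,              # complement
        A | (A & B) == A, A & (A | B) == A,                  # absorption
    ])

rng = random.Random(0)
all_ok = all(
    laws_hold(random_subset(rng), random_subset(rng), random_subset(rng))
    for _ in range(1000)
)
print(all_ok)  # True
```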


Exercise 2.27 (Solution on page 420)

Draw  the  Venn  diagrams  which  justify  the  two  Distributive  Laws. 


We  can  use  the  above  identities  to  derive  even  more  identities,  bypassing 
the  need  to  construct  Venn  diagrams  to  justify  them. 

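Example 2.27

We can derive the identity $A \cup (\overline{A} \cap B) = A \cup B$ using the following sequence of steps:

$$
\begin{aligned}
A \cup (\overline{A} \cap B) &= (A \cup \overline{A}) \cap (A \cup B) && (\textit{Distributivity}) \\
&= U \cap (A \cup B) && (\textit{Complement}) \\
&= (A \cup B) \cap U && (\textit{Commutativity}) \\
&= A \cup B && (\textit{Universe})
\end{aligned}
$$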

Exercise 2.28 (Solution on page 420)

Give a derivation of the identity $A \cap (\overline{A} \cup B) = A \cap B$.

The above laws allow us to reason about set inclusions as well as identities, by observing first that the set inclusion $X \subseteq Y$ can be expressed as a set identity, in any of the following ways:

$$X \cup Y = Y, \qquad X \cap Y = X, \qquad X \setminus Y = \emptyset, \qquad \overline{X} \cup Y = U.$$

That each of the above is equivalent to the proposition that $X \subseteq Y$ can readily be checked by considering the appropriate Venn diagram:

We can derive the new law $A \subseteq A \cup B$ as follows:

• By Associativity and Idempotence, $A \cup (A \cup B) = A \cup B$.

• Letting $X = A$ and $Y = A \cup B$, this says that $X \cup Y = Y$.

• By the above, this means that $X \subseteq Y$; that is, that $A \subseteq A \cup B$.
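The four ways of expressing $X \subseteq Y$ as an identity can be compared exhaustively on a small universe. The following sketch assumes a four-element universe chosen for illustration; the helper names are hypothetical.

```python
from itertools import combinations

U = frozenset(range(4))   # hypothetical universe of discourse

def subsets(universe):
    items = list(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def characterisations(X, Y):
    """The four set identities that each express X being a subset of Y."""
    return [
        X | Y == Y,
        X & Y == X,
        X - Y == frozenset(),
        (U - X) | Y == U,
    ]

all_subsets = subsets(U)

# For every pair (X, Y), all four identities agree with the subset test.
agree = all(
    set(characterisations(X, Y)) == {X <= Y}
    for X in all_subsets
    for Y in all_subsets
)
print(agree)  # True

# The derived law: A is always a subset of A union B.
print(all(A <= (A | B) for A in all_subsets for B in all_subsets))  # True
```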


* 


Logical  Equivalences  versus  Set  Identities 


The  astute  reader  will  have  noticed  that  there  is  a  direct  correspondence 
between  the  Equivalence  Laws  for  Propositional  Logic  from  Section  1.7  and 
the  Set  Identities  from  the  previous  Section  2.7.  For  convenience,  these  laws 
are  listed  once  again  here,  side-by-side. 


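Commutativity Laws                                  Commutativity Laws

$P \lor Q \Leftrightarrow Q \lor P$                                  $A \cup B = B \cup A$
$P \land Q \Leftrightarrow Q \land P$                                $A \cap B = B \cap A$

Associativity Laws                                  Associativity Laws

$P \lor (Q \lor R) \Leftrightarrow (P \lor Q) \lor R$                $A \cup (B \cup C) = (A \cup B) \cup C$
$P \land (Q \land R) \Leftrightarrow (P \land Q) \land R$            $A \cap (B \cap C) = (A \cap B) \cap C$

Idempotence Laws                                    Idempotence Laws

$P \lor P \Leftrightarrow P$                                         $A \cup A = A$
$P \land P \Leftrightarrow P$                                        $A \cap A = A$

Distributivity Laws                                 Distributivity Laws

$P \lor (Q \land R) \Leftrightarrow (P \lor Q) \land (P \lor R)$     $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
$P \land (Q \lor R) \Leftrightarrow (P \land Q) \lor (P \land R)$    $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$

De Morgan’s Laws                                    De Morgan’s Laws

$\neg(P \lor Q) \Leftrightarrow \neg P \land \neg Q$                 $\overline{A \cup B} = \overline{A} \cap \overline{B}$
$\neg(P \land Q) \Leftrightarrow \neg P \lor \neg Q$                 $\overline{A \cap B} = \overline{A} \cup \overline{B}$

Double Negation Law                                 Double Complement Law

$\neg\neg P \Leftrightarrow P$                                       $\overline{\overline{A}} = A$

Tautology Laws                                      Universe Laws

$P \lor \text{true} \Leftrightarrow \text{true}$                     $A \cup U = U$
$P \land \text{true} \Leftrightarrow P$                              $A \cap U = A$

Contradiction Laws                                  Empty Set Laws

$P \lor \text{false} \Leftrightarrow P$                              $A \cup \emptyset = A$
$P \land \text{false} \Leftrightarrow \text{false}$                  $A \cap \emptyset = \emptyset$

Excluded Middle Laws                                Complement Laws

$P \lor \neg P \Leftrightarrow \text{true}$                          $A \cup \overline{A} = U$
$P \land \neg P \Leftrightarrow \text{false}$                        $A \cap \overline{A} = \emptyset$

Absorption Laws                                     Absorption Laws

$P \lor (P \land Q) \Leftrightarrow P$                               $A \cup (A \cap B) = A$
$P \land (P \lor Q) \Leftrightarrow P$                               $A \cap (A \cup B) = A$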

Each law of equivalence for propositions gives rise to a set identity by replacing $\lor$ by $\cup$, $\land$ by $\cap$, and $\neg$ by complement $\overline{\phantom{A}}$ (as well as false by $\emptyset$ and true by $U$). This exploits a tight analogy between logical equivalence $P \Leftrightarrow Q$ and equality of sets $A = B$, which can be extended to logical implication $P \Rightarrow Q$ and subset inclusion $A \subseteq B$ as in the following example.

Example 2.29

The Implication Law from Section 1.7:

$$P \Rightarrow Q \;\Leftrightarrow\; \neg P \lor Q$$

gives rise to the following property of sets:

$$A \subseteq B \text{ if, and only if, } \overline{A} \cup B = U.$$

This property is arrived at by translating $P \Rightarrow Q$ into $A \subseteq B$, and expressing $\neg P \lor Q$ as $\neg P \lor Q \Leftrightarrow \text{true}$ before translating it in the above fashion. (The equivalence symbol itself is translated merely into English.)


Exercise 2.30 (Solution on page 421)

Find properties of sets corresponding to the following laws for propositions taken from Section 1.7.

1. Contrapositive Law: $P \Rightarrow Q \;\Leftrightarrow\; \neg Q \Rightarrow \neg P$.

2. Equivalence Law: $P \Leftrightarrow Q \;\Leftrightarrow\; (P \Rightarrow Q) \land (Q \Rightarrow P)$.


Although  the  analogy  between  propositions  and  sets  is  tight,  care  must 
be  taken  when  trying  to  use  it.  You  should  always  check  the  validity  of  a 
property  of  sets  which  is  so  derived,  for  example  by  considering  the  relevant 
Venn  diagrams. 

Exercise 2.31 (Solution on page 421)

What property of sets is suggested by the following law for propositions:

$$\neg(P \Rightarrow Q) \;\Leftrightarrow\; P \land \neg Q$$

If you do this exercise carefully, you may well arrive at a property which is generally not true of sets. This exercise thus serves to point out that it is dangerous to rely on informal, intuitively-correct arguments.

1. Let $A = \{1, 2, 3, 4, 5\}$, $B = \{4, 5, 6, 7, 8, 9\}$, and $C = \{2, 4, 6, 8\}$.


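What are the following sets?

(a) $A \cup B \cup C$.

(b) $A \cap B \cap C$.

(c) $(A \cap B) \cup C$.

(d) $A \cap (B \cup C)$.

2. What sets are defined by $\{\, x : x \neq x \,\}$ and $\{\, x \in A : x = x \,\}$?

3. Draw Venn diagrams to justify the two De Morgan Laws $\overline{A \cup B} = \overline{A} \cap \overline{B}$ and $\overline{A \cap B} = \overline{A} \cup \overline{B}$.

4. Draw the Venn diagrams which justify the following laws.

(a) $(A \cup B) \setminus C = (A \setminus C) \cup (B \setminus C)$.

(b) $A \cap (B \setminus C) = (A \cap B) \setminus (A \cap C)$.

(c) $A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$.

(d) $(A \setminus B) \setminus C = A \setminus (B \cup C)$.

(e) $A \cup (B \setminus C) = A \cup ((A \cup B) \setminus (A \cup C))$.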

5. What can you say about the sets $A$ and $B$ if we know the following to be true?

(a) $A \cup B = A$.

(b) $A \cap B = A$.

(c) $A \setminus B = A$.

(d) $A \setminus B = B \setminus A$.

6. Form the following sets from the set $A = \{a\}$:

(a) the set $B = \mathcal{P}(A)$;

(b) the set $C = \mathcal{P}(B)$;

(c) the set $D = \mathcal{P}(C)$.

7. Let $A = \{1, \{2, 3\}, \{4, 5, \{6\}\}\}$.

(a) What is $\mathcal{P}(A)$?

(b) State whether the following are true or false.

i. $\emptyset \in A$.

ii. $1 \in A$.

iii. $\{2, 3\} \subseteq A$.

iv. $\{\{2, 3\}\} \subseteq A$.

v. $\{4, 5, \{6\}\} \subseteq A$.

8. The symmetric difference of two sets $A$ and $B$, denoted $A \oplus B$, is the set which contains those elements which are in $A$ or $B$ but not in both $A$ and $B$.

(a) Draw a Venn diagram depicting $A \oplus B$.

(b) Draw Venn diagrams to justify the following laws.

i. $A \oplus B = (A \setminus B) \cup (B \setminus A)$.

ii. $A \oplus B = (A \cup B) \setminus (A \cap B)$.

(c) What propositional connective does $\oplus$ correspond to?

9. Use the Inclusion-Exclusion Principle of Fact 2.11 to show the following three-set version: for finite sets $A$, $B$ and $C$,

$$
\begin{aligned}
|A \cup B \cup C| = {} & |A| + |B| + |C| \\
& - |A \cap B| - |A \cap C| - |B \cap C| \\
& + |A \cap B \cap C|.
\end{aligned}
$$

Explain informally why this principle holds.

10. Use the three-set version of the Inclusion-Exclusion Principle from the previous exercise to solve the hospital problem of Example 2.26.

11.  Felix,  Oskar  and  Amanda  play  a  game  to  see  who  can  list  the  most 
countries  in  five  minutes.  They  each  make  a  list,  and  after  five  minutes 
they  compare  these  lists,  crossing  off  any  countries  that  are  on  more 
than  one  list. 

Felix  had  listed  the  most  countries,  29,  but  they  were  mostly  common 
countries  that  the  other  two  got:  in  fact,  23  of  them  were  on  Oskar’s 
list,  and  12  of  these  23  were  also  on  Amanda’s  list. 

Amanda  had  listed  the  fewest  countries,  22,  but  -  with  more  than  a 
little  help  from  Joel  -  she  had  come  up  with  many  obscure  countries: 
she  had  listed  seven  countries  that  were  not  on  Felix’s  list,  and  nine 
countries  that  were  not  on  Oskar’s  list. 

After  crossing  out  all  of  the  duplicated  countries,  they  were  left  with 
a  total  of  13  countries  on  their  lists. 

Who  won  the  game? 

12. The ordered pair $(x, y)$ can be defined as the set $\{\{x\}, \{x, y\}\}$.

(a) With this definition, show that $(x, y) = (u, v)$ if, and only if, $x = u$ and $y = v$.

(b) Why can we not define the ordered pair as $(x, y) = \{x, \{y\}\}$?

13.  In  a  certain  town  lives  a  barber,  who  is  a  man,  who  shaves  every  man 
in  the  town  who  does  not  shave  himself. 

The  question  is:  Who  shaves  the  barber?  Explain  your  answer. 

14. An adjective is autological if it describes itself. For example, “short” is autological since it is short; and “pentasyllabic” is autological since it is pentasyllabic; that is, it has five syllables. Any adjective that is not autological is said to be heterological. For example, “long” and “monosyllabic” are heterological.

The question is: Is “heterological” autological, or is it heterological? Explain your answer. What about “autological”?


Chapter  3 

*  Boolean  Algebras  and  Circuits 


There  are  10  types  of  people  in  this  world:  those  who  understand 
binary  numbers  and  those  who  don’t. 

-  Anonymous. 

At  the  end  of  the  last  chapter  we  noted  a  close  analogy  between  Equivalence 
Laws  for  Propositional  Logic  on  the  one  hand,  and  Set  Identities  on  the 
other.  In  this  chapter  we  explore  this  connection  by  looking  at  Boolean 
algebras,  the  mathematical  structures  underlying  both  propositional  logic 
and  sets. 

This analogy extends to the world of digital computers and other electronic devices, which are built from circuits which have binary inputs and outputs; that is, they manipulate values from the set $B = \{0, 1\}$. At the implementation level these binary inputs and outputs are delivered by voltages on wires, with a low voltage being interpreted as 0 and a high voltage being interpreted as 1. The simplest components of digital circuits, logic gates, are based on the connectives of propositional logic, with 0 (low voltage) and 1 (high voltage) being interpreted as F (false) and T (true), respectively. Composing logic gates together to create ever more complicated electronic components can thus be done in a way which is amenable to analysis via propositional logic. In this chapter we shall examine the fundamental role of Boolean algebra in underlying the building blocks of digital computers.

3.1 Boolean Algebras

A Boolean algebra is a set $B$ which contains (at least) two distinct special elements 0 and 1, referred to as zero and unit, respectively, along with two binary operators $+$ and $\cdot$, referred to as sum and product, as well as a unary operator $'$, referred to as complementation. That is, for every pair $(x, y)$ of elements of $B$ there are three further (but not necessarily different) elements of $B$ denoted $x + y$, $x \cdot y$, and $x'$. These operators must all satisfy the ten Laws of Boolean Algebra given in Figure 3.1.

$$
\begin{array}{lrcll}
\text{Commutativity:} & x + y & = & y + x & (\textit{Comm1}) \\
 & x \cdot y & = & y \cdot x & (\textit{Comm2}) \\
\text{Associativity:} & (x + y) + z & = & x + (y + z) & (\textit{Assoc1}) \\
 & (x \cdot y) \cdot z & = & x \cdot (y \cdot z) & (\textit{Assoc2}) \\
\text{Distributivity:} & x + (y \cdot z) & = & (x + y) \cdot (x + z) & (\textit{Distr1}) \\
 & x \cdot (y + z) & = & (x \cdot y) + (x \cdot z) & (\textit{Distr2}) \\
\text{Identity:} & x + 0 & = & x & (\textit{Ident1}) \\
 & x \cdot 1 & = & x & (\textit{Ident2}) \\
\text{Complement:} & x + x' & = & 1 & (\textit{Compl1}) \\
 & x \cdot x' & = & 0 & (\textit{Compl2})
\end{array}
$$

Figure 3.1: The Laws of Boolean Algebra.


Boolean  algebras  provide  an  abstract  representation  of  familiar  ideas  in 
various  areas  of  study.  Indeed  we  have  already  met  concrete  examples  of 
Boolean  algebras  in  the  form  of  sets  and  propositions. 


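Example 3.1  The Boolean Algebra of Sets

The power set $\mathcal{P}(U)$ of a set $U$ gives rise to a Boolean algebra, with the roles of $0$, $1$, $+$, $\cdot$ and $'$ taken by $\emptyset$, $U$, $\cup$, $\cap$ and complementation $\overline{\phantom{A}}$, respectively.

In this case, the laws give rise to the following set identities, which we confirmed in Section 2.7:

$$
\begin{array}{lrcll}
\text{Commutativity:} & A \cup B & = & B \cup A & (\textit{Comm1}) \\
 & A \cap B & = & B \cap A & (\textit{Comm2}) \\
\text{Associativity:} & (A \cup B) \cup C & = & A \cup (B \cup C) & (\textit{Assoc1}) \\
 & (A \cap B) \cap C & = & A \cap (B \cap C) & (\textit{Assoc2}) \\
\text{Distributivity:} & A \cup (B \cap C) & = & (A \cup B) \cap (A \cup C) & (\textit{Distr1}) \\
 & A \cap (B \cup C) & = & (A \cap B) \cup (A \cap C) & (\textit{Distr2}) \\
\text{Identity:} & A \cup \emptyset & = & A & (\textit{Ident1}) \\
 & A \cap U & = & A & (\textit{Ident2}) \\
\text{Complement:} & A \cup \overline{A} & = & U & (\textit{Compl1}) \\
 & A \cap \overline{A} & = & \emptyset & (\textit{Compl2})
\end{array}
$$

Example 3.2  The Boolean Algebra of Propositions

The set of propositions gives rise to a Boolean algebra, with the roles of $0$, $1$, $+$, $\cdot$ and $'$ taken by false, true, $\lor$, $\land$ and $\neg$, respectively. (Equality $p = q$ is interpreted by equivalence $p \Leftrightarrow q$.)

In this case, the laws give rise to the following equivalences, which we confirmed in Section 1.7:

$$
\begin{array}{lrcll}
\text{Commutativity:} & p \lor q & \Leftrightarrow & q \lor p & (\textit{Comm1}) \\
 & p \land q & \Leftrightarrow & q \land p & (\textit{Comm2}) \\
\text{Associativity:} & (p \lor q) \lor r & \Leftrightarrow & p \lor (q \lor r) & (\textit{Assoc1}) \\
 & (p \land q) \land r & \Leftrightarrow & p \land (q \land r) & (\textit{Assoc2}) \\
\text{Distributivity:} & p \lor (q \land r) & \Leftrightarrow & (p \lor q) \land (p \lor r) & (\textit{Distr1}) \\
 & p \land (q \lor r) & \Leftrightarrow & (p \land q) \lor (p \land r) & (\textit{Distr2}) \\
\text{Identity:} & p \lor \text{false} & \Leftrightarrow & p & (\textit{Ident1}) \\
 & p \land \text{true} & \Leftrightarrow & p & (\textit{Ident2}) \\
\text{Complement:} & p \lor \neg p & \Leftrightarrow & \text{true} & (\textit{Compl1}) \\
 & p \land \neg p & \Leftrightarrow & \text{false} & (\textit{Compl2})
\end{array}
$$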

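Example 3.3  The two-valued Boolean Algebra

The two-element set $B = \{0, 1\}$ itself gives rise to an important Boolean algebra, with the operations defined as follows:

$$
\begin{array}{cc|c}
x & y & x + y \\ \hline
0 & 0 & 0 \\
0 & 1 & 1 \\
1 & 0 & 1 \\
1 & 1 & 1
\end{array}
\qquad
\begin{array}{cc|c}
x & y & x \cdot y \\ \hline
0 & 0 & 0 \\
0 & 1 & 0 \\
1 & 0 & 0 \\
1 & 1 & 1
\end{array}
\qquad
\begin{array}{c|c}
x & x' \\ \hline
0 & 1 \\
1 & 0
\end{array}
$$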

As  we  shall  see,  this  particular  algebra  is  of  fundamental  importance  in  the 
design  of  digital  circuits. 
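The three operation tables can be written as ordinary functions on the integers 0 and 1. The sketch below is one possible encoding, assuming that sum is taken as `max`, product as `min` and complement as `1 - x`; the function names are hypothetical.

```python
B = (0, 1)

def b_sum(x, y):
    """Sum in the two-valued algebra: 1 if either argument is 1."""
    return max(x, y)

def b_prod(x, y):
    """Product in the two-valued algebra: 1 only if both arguments are 1."""
    return min(x, y)

def b_comp(x):
    """Complement: swaps 0 and 1."""
    return 1 - x

# Rebuild the three tables as lists of rows.
sum_table = [(x, y, b_sum(x, y)) for x in B for y in B]
prod_table = [(x, y, b_prod(x, y)) for x in B for y in B]
comp_table = [(x, b_comp(x)) for x in B]

print(sum_table)   # [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
print(prod_table)  # [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
print(comp_table)  # [(0, 1), (1, 0)]
```

The rows agree with the tables above, and the same functions correspond to Python's bitwise `|` and `&` when restricted to 0 and 1.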


Exercise 3.3 (Solution on page 421)

Verify that the laws of Boolean algebra hold for the two-valued Boolean algebra $B$.

From now on we shall typically omit $\cdot$ and write $xy$ rather than $x \cdot y$, and freely omit parentheses by allowing $\cdot$ to bind tighter than $+$ and $'$ to bind tighter than $\cdot$; thus, for example, we shall write $x + (y \cdot (z'))$ simply as $x + yz'$.

3.2 Deriving Identities in Boolean Algebras

From the Laws of Boolean Algebra, we can derive very many identities which must be true in any Boolean algebra (in particular, as set identities and logical equivalences). In this section we derive some important identities, and leave it as an exercise to consider what these identities say as set identities and logical equivalences, many of which were derived already in previous chapters. We shall state our new identities as Theorems (true statements), and justify their truth using proofs (step-by-step derivations of their truth); the appearance of the box symbol indicates the end of a proof.


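$$
\begin{aligned}
(x + y)z &= xz + yz && (\textit{Distr3}) \\
xy + z &= (x + z)(y + z) && (\textit{Distr4})
\end{aligned}
$$

Proof: We prove only the first identity, and leave the second as an exercise.

$$
\begin{aligned}
(x + y)z &= z(x + y) && (\textit{Comm2}) \\
&= zx + zy && (\textit{Distr2}) \\
&= xz + yz && (\textit{Comm2, twice}) \qquad \Box
\end{aligned}
$$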


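$$
\begin{aligned}
x + x &= x && (\textit{Idemp1}) \\
xx &= x && (\textit{Idemp2})
\end{aligned}
$$

Proof: We prove only the first identity, and leave the second as an exercise.

$$
\begin{aligned}
x + x &= (x + x)1 && (\textit{Ident2}) \\
&= (x + x)(x + x') && (\textit{Compl1}) \\
&= x + xx' && (\textit{Distr1}) \\
&= x + 0 && (\textit{Compl2}) \\
&= x && (\textit{Ident1}) \qquad \Box
\end{aligned}
$$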


$$
\begin{aligned}
x + 1 &= 1 && (\textit{Dom1}) \\
x \cdot 0 &= 0 && (\textit{Dom2})
\end{aligned}
$$

Proof: We prove only the first identity, and leave the second as an exercise.

$$
\begin{aligned}
x + 1 &= x + (x + x') && (\textit{Compl1}) \\
&= (x + x) + x' && (\textit{Assoc1}) \\
&= x + x' && (\textit{Idemp1}) \\
&= 1 && (\textit{Compl1}) \qquad \Box
\end{aligned}
$$


Theorem 3.6  Absorption Laws

    x + xy = x       (Absorp1)
    x(x + y) = x     (Absorp2)


Proof: We prove the first identity here, and present the proof of the second
in Example 3.11.

    x + xy = x1 + xy       (Ident2)
           = x(1 + y)      (Distr2)
           = x(y + 1)      (Comm1)
           = x1            (Dom1)
           = x             (Ident2)  □

Next,  we  prove  a  law  that  we  shall  find  useful  in  further  calculations. 

Theorem 3.7

If x + y = x + z and xy = xz then y = z.


Proof:

    y = y(x + y)      (Comm1, Absorp2)
      = y(x + z)      (Assumption 1)
      = yx + yz       (Distr2)
      = zx + zy       (Comm2, Assumption 2)
      = z(x + y)      (Distr2)
      = z(x + z)      (Assumption 1)
      = z             (Comm1, Absorp2)  □


Next, we consider a few results about complementation. The first of
these is the observation that the two Complementation Laws x + x' = 1 and
xx' = 0 uniquely determine the complement: there is no value y different
from x' which satisfies these two equations.


Theorem 3.8  Uniqueness of Complement

If x + y = 1 and xy = 0 then y = x'. That is to say, x' is the only
element which satisfies x + x' = 1 and xx' = 0.


( Compll ) 

(Comml,  Compll) 

(  ComplS) 

(CommZ,  ComplZ) 


□ 


    (x + y)' = x'y'        (DeMorgan1)
    (xy)' = x' + y'        (DeMorgan2)


Proof: We prove only the first identity, and leave the second as an exercise.
We first note that it suffices to show that

    (x + y)(x'y') = 0   and   (x + y) + (x'y') = 1

as then, by the Uniqueness of Complement Theorem 3.8, we would get that
(x + y)' = x'y'.

    (x + y)(x'y') = x(x'y') + y(x'y')      (Distr3)
                  = 0y' + 0x'              (Assoc2, Comm2, Compl2)
                  = 0 + 0                  (Dom2)
                  = 0                      (Idemp1)

    (x + y) + (x'y') = ((x + y) + x')((x + y) + y')     (Distr1)
                     = (y + 1)(x + 1)                   (Assoc1, Comm1, Compl1)
                     = 1 · 1                            (Dom1)
                     = 1                                (Idemp2)  □
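
As a sanity check on the identities derived in this section, the following Python sketch evaluates both sides of several of them for every choice of values in the two-valued Boolean algebra B. The function names (b_or, b_and, b_not, holds_in_B) are illustrative choices, not notation from the text. Note that this only confirms that the identities hold in B; it is not a substitute for the algebraic proofs, which are valid in every Boolean algebra.

```python
from itertools import product

# Operations of the two-valued Boolean algebra B on the values 0 and 1.
def b_or(x, y):    # x + y
    return x | y

def b_and(x, y):   # xy
    return x & y

def b_not(x):      # x'
    return 1 - x

# Each identity is stored as a pair (left-hand side, right-hand side),
# both written as functions of three variables x, y and z.
IDENTITIES = {
    "Distr3": (lambda x, y, z: b_and(b_or(x, y), z),
               lambda x, y, z: b_or(b_and(x, z), b_and(y, z))),
    "Idemp1": (lambda x, y, z: b_or(x, x),
               lambda x, y, z: x),
    "Dom1": (lambda x, y, z: b_or(x, 1),
             lambda x, y, z: 1),
    "Absorp1": (lambda x, y, z: b_or(x, b_and(x, y)),
                lambda x, y, z: x),
    "DeMorgan1": (lambda x, y, z: b_not(b_or(x, y)),
                  lambda x, y, z: b_and(b_not(x), b_not(y))),
}

def holds_in_B(lhs, rhs):
    """Return True if lhs and rhs agree on all eight assignments of x, y, z."""
    return all(lhs(x, y, z) == rhs(x, y, z)
               for x, y, z in product((0, 1), repeat=3))

results = {name: holds_in_B(lhs, rhs) for name, (lhs, rhs) in IDENTITIES.items()}
for name, ok in results.items():
    print(name, "holds in B" if ok else "fails in B")
```

The same holds_in_B helper can be pointed at any candidate identity; a false "identity" such as x + y = xy is rejected because the two sides differ when x = 1 and y = 0.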


Exercise 3.10  (Solution on page 422)

Prove the following theorems.

1. (xy + x'y')' = xy' + x'y.

2. If x+y = x+z and x'+y = x'+z then y = z.

3. If x+y = 0 then x = y = 0.

4. x = 0 if, and only if, y = xy' + x'y for all y.


3.3  The Duality Principle

Given any formula in a Boolean algebra, its dual is formed by interchanging
0 and 1, and + and ·, throughout. More generally, the dual of a statement
involving Boolean algebra is that statement with every formula replaced
with its dual. Thus, for example, the dual of x + y'z = 1 is x(y'+z) = 0.
The following is a fundamental principle of Boolean algebras.

Theorem 3.11  The Principle of Duality

The  dual  of  every  theorem  of  Boolean  algebra  is  also  a  theorem. 

Proof: To see that this is a valid principle, we merely need realise that a
proof of a theorem becomes a proof of the dual of the theorem simply by
replacing each formula used in the proof by its dual. This is so since the
Laws of Boolean Algebra consist of five statements and their duals.  □


Example 3.11

Consider the following derivation of the second Absorption Law x(x+y) = x:

    x(x + y) = (x + 0)(x + y)      (Ident1)
             = x + 0y              (Distr1)
             = x + y0              (Comm2)
             = x + 0               (Dom2)
             = x                   (Ident1)


If we compare this derivation with that given in the proof of the first
Absorption Law x + xy = x in Theorem 3.6, the duality is immediately
apparent: the two derivations are identical, but for the fact that each
expression is replaced by its dual, and the identity justifying each step in the
above derivation is the dual of the identity justifying the same step in the
first derivation.


This principle allows us to infer the validity of the dual of any theorem
that we prove, since a proof of the dual theorem can be constructed
automatically from the proof of the theorem, simply by replacing every formula
and identity with its dual, as in the above example. Throughout the previous
section we provided theorems presenting pairs of identities; and in each
case we proved only the first identity of the pair, leaving the proof of the
second as an exercise. In fact, the second identity in each case is the dual of the
first; so by using the Duality Principle, proofs of these are unnecessary. The
Duality Principle guarantees that they are valid.
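
The mechanical nature of forming a dual can be illustrated with a short Python sketch. The encoding of formulas as nested tuples is an assumption made purely for this illustration: a variable is a string, the constants are the integers 0 and 1, ("+", p, q) stands for p + q, ("*", p, q) stands for pq, and ("'", p) stands for p'.

```python
def dual(formula):
    """Interchange 0 with 1 and + with * throughout a formula.

    Variables and complementation are left unchanged, as in the
    definition of the dual given in this section.
    """
    if isinstance(formula, int):      # the constants 0 and 1 are swapped
        return 1 - formula
    if isinstance(formula, str):      # a variable is its own dual
        return formula
    operator, *operands = formula
    swapped = {"+": "*", "*": "+", "'": "'"}[operator]
    return (swapped, *(dual(operand) for operand in operands))

def dual_equation(lhs, rhs):
    """The dual of the statement lhs = rhs is dual(lhs) = dual(rhs)."""
    return dual(lhs), dual(rhs)

# The statement x + y'z = 1 from the text, in the tuple encoding.
example_lhs = ("+", "x", ("*", ("'", "y"), "z"))
example_rhs = 1

print(dual_equation(example_lhs, example_rhs))
```

Applied to the example from the text, the sketch produces the encoding of x(y'+z) = 0. Taking the dual twice returns the original formula, which reflects the symmetry on which the Duality Principle rests.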

Exercise 3.11  (Solution on page 423)

Write  out  the  dual  of  each  of  the  following  theorems  from  Exercise  3.10. 

1.  (xy  +  x'y')'  =  xy'  +  x'y. 

2.  If  x+y  =  x+z  and  x'+y  =  x'+z  then  y  =  z. 

3.  If  x+y  =  0  then  x  =  y  =  0. 

4. x = 0 if, and only if, y = xy' + x'y for all y.


3.4  Logic Gates and Digital Circuits


Computers manipulate all forms of information: numbers, names, sounds,
pictures, videos; but an electronic computer can only reliably represent data
in essentially one way: either a wire has a high voltage, or it has a low
voltage. By interpreting a high voltage as the number 1 and a low voltage as the
number 0, every piece of data represented and manipulated by an electronic
computer is reduced within the electronics of the machine to combinations
of the binary digits (the bits) 0 and 1.

At its lowest level, a computer manipulates this binary data using
digital circuits which transform voltages on wires feeding into the circuit into
voltages on wires leading out from it. How the electronics works (using
transistors) to cause the output voltages to reflect the correct values
according to the input voltages is not a question that will concern us here; such
concerns are left to physicists and electronics engineers.


Example  3.12 


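Consider a circuit HA with two input wires, labelled x and y, and two output
wires, labelled s and c. Such a circuit might be represented as follows:

[Figure: the circuit HA drawn as a box, with input wires x and y entering on the left and output wires s and c leaving on the right.]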


Note  that  when  we  draw  a  circuit,  we  will  assume  that  its  input  lines  enter 
from  the  left  and  its  output  lines  exit  from  the  right. 

Such  a  picture  may  represent  the  circuit  simply  as  a  black  box  as  above, 
with  no  indication  as  to  how  the  output  values  relate  to  the  input  values. 
However,  we  can  describe  the  behaviour  of  the  circuit  by  indicating  what 
output  values  are  produced  from  each  of  the  possible  input  values.  To  do 
this,  we  can  list  all  of  the  possibilities  in  the  form  of  a  truth  table.  For 
example,  the  circuit  HA  which  we  have  in  mind  above  behaves  as  follows: 


    x   y   s   c
    0   0   0   0
    0   1   1   0
    1   0   1   0
    1   1   0   1

Thus,  for  example,  if  both  input  wires  x  and  y  hold  high  voltages,  thus  both 
representing  the  value  1,  then  the  output  wire  s  will  be  given  a  low  voltage, 
representing  the  value  0,  and  the  output  wire  c  will  be  given  a  high  voltage, 
representing  the  value  1. 


Computer circuits can be extremely complicated - far more complicated
than the above example. However, all circuits, including the one above, can
be built up from three very basic building blocks (which can all be easily
implemented using transistors): OR gates, AND gates and NOT gates.

An OR gate is a simple component circuit which takes two inputs x
and y and produces the single output x+y defined by

    x+y  =  1  if x=1 or y=1;
            0  otherwise.

Graphically it is drawn as follows:

[Figure: OR gate symbol.]

An AND gate is a simple component circuit which takes two inputs x
and y and produces the single output x·y defined by

    x·y  =  1  if x=1 and y=1;
            0  otherwise.

Graphically it is drawn as follows:

[Figure: AND gate symbol.]

A NOT gate is a simple component circuit which takes one input x and
produces the single output x' defined by

    x'  =  1  if x=0;
           0  if x=1.

Graphically it is drawn as follows:

[Figure: NOT gate symbol.]


Truth tables defining these three gates are as follows:

    x   y   x+y        x   y   x·y        x   x'
    0   0    0         0   0    0         0   1
    0   1    1         0   1    0         1   0
    1   0    1         1   0    0
    1   1    1         1   1    1

We can observe from the above definitions that the three basic gates
compute exactly the functions of the two-valued Boolean algebra B defined in
Example 3.3. (Note that, as before, we shall typically write xy instead of
x·y.) This section makes clear, then, the fundamental importance of this
particular Boolean algebra. It is absolutely essential in the design of digital
computers.

We  can  build  large  complicated  circuits  from  these  three  basic  gates 
by  stringing  them  together  -  always  in  a  left-to-right  fashion.  (Allowing 
feedback  wires  provides  its  own  uses  -  and  complications  -  which  we  shall 
not  explore.) 

Example 3.13

Consider the following circuit:

[Figure: a circuit with inputs x, y and z entering on the left; x and y feed an AND gate, z feeds a NOT gate, and the outputs of these two gates feed an OR gate.]

There are three inputs x, y and z to this circuit. The inputs x and y feed
into an AND gate which outputs an intermediate value u = xy. Meanwhile,
the input z feeds into a NOT gate which outputs a second intermediate value
v = z'. The two intermediate values u and v output by the first two gates
then feed as inputs into an OR gate which outputs the final value w = u + v.
The effect of the whole circuit, therefore, is to output the value w = xy + z'.
The value that is output is thus given according to the following table:


    x   y   z   u   v   w
    0   0   0   0   1   1
    0   0   1   0   0   0
    0   1   0   0   1   1
    0   1   1   0   0   0
    1   0   0   0   1   1
    1   0   1   0   0   0
    1   1   0   1   1   1
    1   1   1   1   0   1

For example, if the inputs have values x=1, y=0 and z=1, then the output
of the AND gate will be 0, as will the output of the NOT gate; and since
both of the inputs to the OR gate will be 0, the value of the output w will
also be 0.
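
The way values flow through the gates of this circuit can be mimicked in Python. In the sketch below, the function names AND, OR, NOT and circuit are illustrative choices; each gate is modelled as a function on the values 0 and 1, and the intermediate wires u and v are ordinary variables.

```python
from itertools import product

def AND(x, y):
    return x & y

def OR(x, y):
    return x | y

def NOT(x):
    return 1 - x

def circuit(x, y, z):
    """Model of the circuit of Example 3.13; returns the values (u, v, w)."""
    u = AND(x, y)   # first intermediate value, u = xy
    v = NOT(z)      # second intermediate value, v = z'
    w = OR(u, v)    # final output, w = u + v = xy + z'
    return u, v, w

# Tabulate the circuit on all eight combinations of inputs.
table = [(x, y, z, *circuit(x, y, z)) for x, y, z in product((0, 1), repeat=3)]

print("x y z u v w")
for row in table:
    print(*row)
```

Evaluating the gates in the order in which the wires connect them corresponds to reading the circuit from left to right.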


The relationship between the table defining the function w = xy + z' and
the truth table for the proposition (P ∧ Q) ∨ ¬R is, hopefully, obvious.


Example 3.14

Consider the following circuit with three input lines a, b and c, and one
output line m:

[Figure: a circuit in which the inputs a, b and c feed three AND gates, whose outputs are combined by two OR gates to give the output m.]

Note that we have used a dot to split a line, directing the same value
(voltage) to two different inputs; and we have allowed lines to cross without
interference (as if they were insulated from each other).

We can analyse the behaviour of this circuit as follows:

• the inputs a and b feed into the first AND gate to produce an
intermediate value x = ab;

• the inputs a and c feed into the second AND gate to produce an
intermediate value y = ac;

• the inputs b and c feed into the third AND gate to produce an
intermediate value z = bc;

• the values x and y then feed into the first OR gate to produce a further
intermediate value w = x + y;

• finally, the values w and z feed into the second OR gate to produce
the final value m = w + z.

We  can  tabulate  the  value  that  is  output  by  this  circuit  on  any  set  of  inputs 
as  follows: 


    a   b   c   x   y   z   w   m
    0   0   0   0   0   0   0   0
    0   0   1   0   0   0   0   0
    0   1   0   0   0   0   0   0
    0   1   1   0   0   1   0   1
    1   0   0   0   0   0   0   0
    1   0   1   0   1   0   1   1
    1   1   0   1   0   0   1   1
    1   1   1   1   1   1   1   1

Algebraically, the effect of the circuit is to output the value

    m = w + z = x + y + z = ab + ac + bc.

In other words, this circuit computes the majority function: the output m
will be 1 exactly when at least two of the input values are 1.
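
The claim that this circuit computes the majority function can be checked exhaustively. The Python sketch below follows the five steps of the analysis above; the function name majority_circuit is an illustrative choice.

```python
from itertools import product

def majority_circuit(a, b, c):
    """Model of the circuit of Example 3.14, gate by gate."""
    x = a & b      # first AND gate
    y = a & c      # second AND gate
    z = b & c      # third AND gate
    w = x | y      # first OR gate
    m = w | z      # second OR gate
    return m

# Compare the circuit with a direct count of the inputs that are 1.
agrees_with_count = all(
    majority_circuit(a, b, c) == (1 if a + b + c >= 2 else 0)
    for a, b, c in product((0, 1), repeat=3)
)
print(agrees_with_count)
```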


Exercise 3.14  (Solution on page 423)

The exclusive-OR gate, or XOR gate, has the following definition (and
gate symbol):

    x   y   x ⊕ y
    0   0     0
    0   1     1
    1   0     1
    1   1     0

[Figure: XOR gate symbol with inputs x and y and output z = x ⊕ y.]

That is, the output z = x ⊕ y has the value 1 when exactly one of the inputs
x or y has the value 1 (and the other has the value 0).

Build a circuit which realises this gate.


Exercise 3.15  (Solution on page 423)


Describe the behaviour of the following circuit by providing a Boolean
expression and a truth table defining the output value r.


Exercise 3.16  (Solution on page 424)

Consider  a  car  safety  system  in  which  a  warning  bell  rings  whenever  the 
motor  is  running  while  a  door  is  open  or  a  seat  belt  is  unbuckled.  This  is 
to  be  implemented  as  a  Boolean  circuit  which  takes  three  inputs  M,  D  and 
B,  respectively  representing  the  states  of  the  motor,  doors  and  seat  belts: 

•  M  will  be  1  if  the  motor  is  running  and  0  otherwise; 

•  D  will  be  1  if  the  doors  are  closed  and  0  otherwise; 

•  B  will  be  1  if  the  seat  belts  are  fastened  and  0  otherwise. 

The circuit is to produce a single output R which should be 1 if the warning
bell should ring and 0 otherwise. Build a circuit for this system.


3.5  Making Computers Add

In  this  section  we  consider  the  problem  of  constructing  a  circuit  which  will 
add  two  integers.  To  do  this,  we  must  first  understand  how  integers  are 
represented  and  manipulated  by  a  computer  using  just  the  binary  digits. 

3.5.1  Binary  Numbers 

People  have  ten  fingers,  and  children  learn  early  on  to  count  using  the  ten 
digits  on  their  hands.  When  counting  beyond  ten  on  your  fingers,  the  natural 
thing  to  do  is  to  keep  track  of  how  many  times  you  run  through  your  fingers, 
which  you  can  ask  someone  else  to  do  using  their  ten  fingers.  Then  a  third 
person  in  turn  can  use  their  ten  fingers  to  keep  track  of  how  many  times  the 
second  person  runs  through  all  of  their  fingers,  which  happens  every  time 
you  reach  100  (i.e.,  each  time  you  run  through  your  own  ten  fingers  ten 
times).  When  the  third  person  runs  out  of  fingers,  you  will  have  counted 
ten  lots  of  100,  i.e.  up  to  1000.  If  you  are  still  counting,  a  fourth  person 
can  use  their  ten  fingers  to  keep  track  of  how  many  lots  of  1000  you  have 
counted.  A  fifth  person  can  then  keep  track  of  how  many  lots  of  10000  are 
counted;  a  sixth  how  many  lots  of  100000;  etc. 

This  mechanism  for  counting  is  reflected  in  our  use  of  decimal  numbers, 
which  is  a  positional  notation  for  expressing  quantities.  For  example,  when 
we  write  the  decimal  number  6538,  we  interpret  the  four  digits  as  follows: 

• the 8 in the rightmost position represents 8 ones;

• the 3 in the second position from the right represents 3 lots of tens;

• the 5 in the third position from the right represents 5 lots of hundreds
(i.e., tens of tens); and

• the 6 in the fourth position from the right represents 6 lots of thousands
(i.e., tens of hundreds, or tens of tens of tens).

That is,

    6538 = 6 × 10^3 = 6 × 1000 = 6000
         + 5 × 10^2 = 5 × 100  =  500
         + 3 × 10^1 = 3 × 10   =   30
         + 8 × 10^0 = 8 × 1    =    8
                                 ----
                                 6538

Digital computers have access to only the two binary digits, 0 and 1,
not the ten decimal digits. Therefore, they naturally represent quantities
as binary numbers, which are sequences of binary digits (bits), rather than
as decimal numbers, which are sequences of decimal digits. For example, the
binary number 11101 is interpreted as follows:

• the 1 in the rightmost position represents 1 one;

• the 0 in the second position from the right represents 0 lots of twos;

• the 1 in the third position from the right represents 1 lot of fours (i.e.,
twos of twos);

• the 1 in the fourth position from the right represents 1 lot of eights
(i.e., twos of fours, or twos of twos of twos); and

• the 1 in the fifth position from the right represents 1 lot of sixteens
(i.e., twos of eights, or twos of twos of twos of twos).

That is,

    11101 = 1 × 2^4 = 1 × 16 = 16
          + 1 × 2^3 = 1 × 8  =  8
          + 1 × 2^2 = 1 × 4  =  4
          + 0 × 2^1 = 0 × 2  =  0
          + 1 × 2^0 = 1 × 1  =  1
                              --
                              29
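
The positional reading of a binary number translates directly into a short Python function. The name binary_to_decimal is an illustrative choice, and the sketch assumes its argument is a string consisting only of the characters 0 and 1.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum bit × 2^position over all positions, counting from the right."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total

print(binary_to_decimal("11101"))
```

Python's built-in int("11101", 2) performs the same conversion; the explicit loop is shown only because it mirrors the calculation above term by term.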

Any  natural  number  can  be  represented  as  a  binary  number,  just  as  easily 
as  it  can  be  represented  as  a  decimal  number.  The  method  for  translating 
from  binary  to  decimal  can  be  extracted  from  the  above  description;  and  the 
method  for  translating  from  decimal  to  binary  is  almost  as  easy:  we  merely 
have  to  keep  subtracting  from  the  number  in  question  the  largest  power  of 
two  that  we  can. 
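
The subtraction method just described can be sketched in Python as follows. The function name decimal_to_binary is an illustrative choice, and the sketch assumes that the input is a natural number.

```python
def decimal_to_binary(n: int) -> str:
    """Translate a natural number to binary by repeatedly subtracting
    the largest power of two that fits."""
    if n < 0:
        raise ValueError("only natural numbers are handled in this sketch")
    if n == 0:
        return "0"
    # Find the largest power of two not exceeding n.
    power = 1
    while power * 2 <= n:
        power *= 2
    bits = []
    # Work down through the powers of two, writing 1 whenever the power
    # can be subtracted from what remains, and 0 otherwise.
    while power >= 1:
        if power <= n:
            n -= power
            bits.append("1")
        else:
            bits.append("0")
        power //= 2
    return "".join(bits)

print(decimal_to_binary(51))
```

Each power of two that is skipped contributes a 0 bit, which is why the loop visits every power from the largest down to 2^0 rather than only those that are subtracted.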


To translate the decimal value 51 into binary:

• subtract 2^5 = 32 from 51 to give a remainder of 19;

• subtract 2^4 = 16 from 19 to give a remainder of 3;

• subtract 2^1 = 2 from 3 to give a remainder of 1;

• subtract 2^0 = 1 from 1 to give a remainder of 0.

We can thus express the decimal number 51 = 32 + 16 + 2 + 1 in binary as:

    110011 = 1 × 2^5 = 1 × 32 = 32
           + 1 × 2^4 = 1 × 16 = 16
           + 0 × 2^3 = 0 × 8  =  0
           + 0 × 2^2 = 0 × 4  =  0
           + 1 × 2^1 = 1 × 2  =  2
           + 1 × 2^0 = 1 × 1  =  1
                                --
                                51


3.5.2  Adding Binary Numbers

Consider  how  we  would  naturally  add  two  (decimal)  numbers  by  hand.  We 
would  first  line  the  numbers  up  one  on  top  of  the  other.  Then  we  would 
add  the  units,  writing  down  the  unit  sum  digit  and  moving  the  carry  digit 
(if  there  is  one)  to  the  top  of  the  tens  column;  then  we  would  add  the  three 
numbers  in  the  tens  column,  writing  down  the  tens  sum  digit  and  moving 
the  carry  digit  to  the  top  of  the  hundreds  column;  then  we  would  add  the 
three  numbers  in  the  hundreds  column,  writing  down  the  hundreds  sum 
digit  and  moving  the  carry  digit  to  the  top  of  the  thousands  column;  and 
continue  doing  this  same  calculation  with  each  column  from  right  to  left. 

This  same  method  works  equally  well  for  binary  numbers,  and  is  the 
basis  for  how  digital  computers  add  numbers  represented  in  binary. 

Example 3.17

To add the two binary numbers 11101 and 10110, write them one over the
other and add the bits column-wise from right to left, including carries where
necessary, as indicated:

      1 1 1 0 1
    + 1 0 1 1 0
    -----------
    1 1 0 0 1 1

The first two columns on the right each give a sum of 1 with no carry; but
the third column from the right gives a 0 sum with a carry, as then does the
fourth column. The fifth column gives a sum of 1 with a carry, which gives
a sum of 1 for the new sixth column.
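
The column-by-column method can be sketched in Python, working on binary numbers written as strings of the characters 0 and 1. The function name add_binary is an illustrative choice.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary numbers column by column, from right to left."""
    # Pad the shorter number with leading zeros so the columns line up.
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)

    carry = 0
    sum_bits = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry   # 0, 1, 2 or 3
        sum_bits.append(str(column % 2))           # sum bit for this column
        carry = column // 2                        # carry into the next column
    if carry:
        sum_bits.append("1")                       # a new leftmost column
    return "".join(reversed(sum_bits))

print(add_binary("11101", "10110"))
```

For the two numbers of Example 3.17 this yields 110011, in agreement with the hand calculation.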


Exercise 3.17  (Solution on page 424)

What decimal sum is being calculated in the above example?


We are now in a position to design a digital circuit which adds two
integers represented as binary numbers. More specifically, we shall build a
circuit which will have 8 input lines representing two 4-bit binary
numbers a3a2a1a0 and b3b2b1b0, and 5 output lines representing the 5-bit binary
number s4s3s2s1s0 resulting from adding a3a2a1a0 and b3b2b1b0:

[Figure: a box labelled "4-bit Adder" with input lines a0, a1, a2, a3, b0, b1, b2 and b3 on the left and output lines s0, s1, s2, s3 and s4 on the right.]

The construction we give can be easily scaled up to add arbitrarily long bit
strings.


3.5.3  Building  Half  Adders 

The  basic  component  from  which  we  shall  build  our  4-bit  adder  is  the  circuit 
HA  from  Example  3.12  (page  95),  which  takes  the  two  inputs  x  and  y  and 
produces  the  two  outputs  s  and  c  representing  the  sum  of  x  and  y,  with  s 
being  the  sum  bit  and  c  being  the  carry  bit.  Such  a  circuit  is  called  a  half 
adder. 

Our  first  task  is  to  express  the  outputs  in  terms  of  the  functions  of  the 
basic  gates.  For  a  start,  computing  the  carry  bit  c  is  obvious:  being  1  exactly 
when  both  x  and  y  are  1,  it  is  their  product  c  =  xy.  The  sum  bit  s  is  only 
slightly  more  cumbersome.  It  is  1  when  one  of  the  inputs  is  1  and  the  other 
is  0:  s  =  x'y+xy' . 

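Towards building these functions in a circuit using the three basic gates,
we can first note the following two circuits that compute x'y and xy',
respectively:

[Figure: two circuits, the first computing x'y and the second computing xy'.]

These can then be combined to give a circuit for x'y+xy' as follows:

[Figure: the two circuits above feeding an OR gate, giving the output x'y+xy'.]

The above circuit computes the sum bit of the half adder; it only remains
to add a further AND gate which computes the carry bit to complete the
circuit:

[Figure: the half adder circuit, with outputs s = x'y+xy' and c = xy.]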


This circuit consists of six gates: three AND gates, two NOT gates and
one OR gate. The question then arises: is it possible to build a simpler
circuit which performs the computation of a half adder? Such questions
are important when contemplating fitting ever-more computing power on
a computer chip; you would certainly want to find the smallest possible
circuits to compute the functions that are implemented on the chip.

Using the laws of Boolean algebra, we can make the following calculation:

    x'y + xy' = xy' + x'y           (commutativity)
              = (xy + x'y')'        (Exercise 3.10(1))
              = (xy)'(x'y')'        (DeMorgan1)
              = (xy)'(x'' + y'')    (DeMorgan2)
              = (xy)'(x + y)        (Involution, twice)
              = (x + y)(xy)'        (commutativity)

This complicated calculation, in fact, tells us something natural: that having
one input line holding the value 1 and the other holding the value 0 (x'y+xy')
is the same as having at least one of the input lines holding the value 1 and
not having both input lines holding the value 1 ((x+y)(xy)').

Indeed, such intuitive observations are where ideas for optimisations
typically arise. The above derivation was a necessary step in the design process,
in justifying the intuition which suggested the optimisation.

Importantly, the final expression (x+y)(xy)' is simpler to evaluate than
x'y+xy', requiring only four basic operations rather than five; moreover, the
product xy is calculated in the process, so we need no further operations
to complete the half adder circuit. The corresponding circuit is as follows:

[Figure: a half adder built from four gates: an OR gate computing x+y, an AND gate computing c = xy, a NOT gate computing (xy)', and a second AND gate computing s = (x+y)(xy)'.]


We  have  thus  managed  to  build  the  half  adder  using  four  gates  instead  of 
six.  Improving  designs  like  this  in  order  to  reduce  the  number  of  gates  -  in 
this  case  by  a  third  -  is  of  obvious  importance  when  it  comes  to  fitting  more 
power  within  the  limited  space  on  a  circuit  board.  Reasoning  with  Boolean 
algebra  is  a  crucial  activity  in  the  design  of  computer  processors. 
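
The two half adder designs can be compared directly in Python. The sketch below models each gate as an operation on the values 0 and 1; the function names are illustrative choices. It checks that both designs produce the same outputs on every input, and that the outputs really do represent the sum x + y.

```python
from itertools import product

def half_adder_six_gates(x, y):
    """s = x'y + xy' and c = xy: two NOT, three AND and one OR gate."""
    not_x = 1 - x
    not_y = 1 - y
    s = (not_x & y) | (x & not_y)
    c = x & y
    return s, c

def half_adder_four_gates(x, y):
    """s = (x+y)(xy)' and c = xy: one OR, two AND and one NOT gate."""
    c = x & y              # the carry bit, reused in computing the sum bit
    s = (x | y) & (1 - c)
    return s, c

designs_agree = all(
    half_adder_six_gates(x, y) == half_adder_four_gates(x, y)
    for x, y in product((0, 1), repeat=2)
)
print(designs_agree)
```

In the four-gate design the value xy is computed once and used twice, which is exactly where the saving comes from.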


3.5.4  Building  Full  Adders 


We  are  designing  a  circuit  which  will  add  two  binary  numbers  using  the  usual 
method  of  summing  bits  column-by-column.  So  far  we  have  constructed  a 
half  adder  which  takes  two  bits  and  adds  these  together,  producing  a  sum 
bit  and  a  carry  bit.  However,  we  will  also  need  a  circuit  which  adds  not  just 
two  digits,  as  the  half  adder  does,  but  rather  three  digits,  to  cater  for  the 
carry  bit.  Such  a  circuit  is  called  a  full  adder  and  has  the  following  form: 


[Figure: a full adder drawn as a box, with input wires x, y and z entering on the left and output wires s and c leaving on the right.]

The input wires x, y and z each have the value 0 or 1, and sum up to either
0, 1, 2 or 3, which is reflected in the output wires s and c.

The sum bit s will be 1 if exactly one or all three of the input bits x, y
and z are 1:

    s = xy'z' + x'yz' + x'y'z + xyz;

and the carry bit will be 1 if at least two of the input bits are 1:

    c = xyz' + xy'z + x'yz + xyz.

Letting  t  =  yz'+y'z  be  the  value  of  the  sum  bit  from  a  half  adder  with 
inputs  y  and  z,  and  noting  from  Exercise  3.10(1)  that  t'  =  yz+y'z' ,  we  can 
note  that 

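    s = xy'z' + x'yz' + x'y'z + xyz
      = x'(yz'+y'z) + x(yz+y'z')       (commutativity/distributivity)
      = x't + xt'

and

    c = xyz' + xy'z + x'yz + xyz
      = x(yz'+y'z) + yz(x+x')          (distributivity)
      = xt + yz                        (complementation, identity)

These outputs are generated by combining two half adders and an OR gate
as follows:

[Figure: a full adder built from two half adders and an OR gate. The first half adder takes y and z, producing the sum t and the carry yz; the second takes x and t, producing the sum s = xt' + x't and the carry xt; the OR gate combines the two carries to give c = xt + yz.]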
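
The construction of a full adder from two half adders and an OR gate can be modelled in Python as follows. The function names half_adder and full_adder are illustrative choices, and the half adder is the four-gate design s = (x+y)(xy)', c = xy.

```python
from itertools import product

def half_adder(x, y):
    """Return (sum bit, carry bit) for the two bits x and y."""
    c = x & y
    s = (x | y) & (1 - c)
    return s, c

def full_adder(x, y, z):
    """Return (sum bit, carry bit) for the three bits x, y and z."""
    t, carry_yz = half_adder(y, z)   # t = yz' + y'z, with carry yz
    s, carry_xt = half_adder(x, t)   # s = xt' + x't, with carry xt
    c = carry_xt | carry_yz          # c = xt + yz
    return s, c

for x, y, z in product((0, 1), repeat=3):
    s, c = full_adder(x, y, z)
    print(x, y, z, "->", "s =", s, "c =", c)
```

Reading the two output bits as the binary number cs gives the number of inputs that are 1, which is a convenient way to check the table by hand.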


3.5.5  Putting  It  All  Together 


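Having defined a full adder, adding two n-bit numbers is then achieved by
stringing n such full adders together. In particular, to build our 4-bit adder,
which adds together the two 4-bit binary numbers a3a2a1a0 and b3b2b1b0 to
produce the 5-bit binary number s4s3s2s1s0, we would use the following
circuit:

[Figure: a half adder taking a0 and b0, followed by three full adders taking a1 and b1, a2 and b2, and a3 and b3; each adder passes its carry bit to the next, the sum outputs are s0, s1, s2 and s3, and the final carry bit is s4.]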
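
The chain of adders can be sketched in Python for bit strings of any length. The names half_adder, full_adder and ripple_adder are illustrative choices, and the sketch assumes that each number is given as a list of bits with the least significant bit first (a0, a1, a2, ...), both lists having the same length.

```python
def half_adder(x, y):
    c = x & y
    s = (x | y) & (1 - c)
    return s, c

def full_adder(x, y, z):
    t, carry_yz = half_adder(y, z)
    s, carry_xt = half_adder(x, t)
    return s, carry_xt | carry_yz

def ripple_adder(a_bits, b_bits):
    """Add two n-bit numbers, returning the n+1 sum bits s0, s1, ..., sn."""
    if len(a_bits) != len(b_bits) or not a_bits:
        raise ValueError("expected two non-empty bit lists of equal length")
    # The first column has no incoming carry, so a half adder suffices.
    s0, carry = half_adder(a_bits[0], b_bits[0])
    sums = [s0]
    # Every later column adds two bits and the carry from the previous column.
    for a, b in zip(a_bits[1:], b_bits[1:]):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    sums.append(carry)   # the final carry becomes the top sum bit
    return sums

# a3a2a1a0 = 1101 and b3b2b1b0 = 0110, written least significant bit first.
print(ripple_adder([1, 0, 1, 1], [0, 1, 1, 0]))
```

Because each adder only needs the carry from the column to its right, lengthening the input lists adds one more full adder per extra bit and changes nothing else.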


We start with a half adder, as we don’t have to worry about a carry bit for
the first two bits a0 and b0. Of course, stringing more full adders together
would allow larger values to be added, meaning that this circuit can be easily
scaled up.


Additional Exercises


1.  Prove  the  second  Further  Distributive  Law,  the  second  Idempotence 
Law,  the  second  Absorption  Law,  and  the  second  Domination  Law, 
all  using  the  Laws  of  Boolean  Algebra.  (That  is,  do  not  rely  on  the 
Duality  Principle.) 

2. Prove that the set S = {1, 2, 5, 10} of divisors of 10 is a Boolean algebra
with zero 1 and unit 10, with x + y interpreted as the least common
multiple of x and y, lcm(x, y); xy interpreted as the greatest common
factor of x and y, gcd(x, y); and x' = 10/x.

3. Prove that if we take S = {1, 2, 3, 4, 6, 12} to be the set of divisors of 12
in Exercise 2 above, then we would not get a Boolean algebra.

4. Does the finite powerset of a set U give rise to a Boolean
algebra (with, as usual, the roles of 0, 1, +, · and ' taken by ∅, U, ∪,
∩ and complement, respectively)? Justify your answer.

5. (a) Prove that xy' = 0 if, and only if, x' + y = 1.

(b)  State  and  prove  the  dual  of  the  theorem  in  part  (a). 

6.  (a)  Prove  that  x  =  y  if,  and  only  if,  xy'  +  x'y  =  0. 

(b)  State  and  prove  the  dual  of  the  theorem  in  part  (a). 

7. The NAND gate has the following definition (and symbol):

    x   y   x | y
    0   0     1
    0   1     1
    1   0     1
    1   1     0

[Figure: NAND gate symbol with inputs x and y and output z = x | y.]

That is, the output z = x | y has the value 0 if both of the inputs x
and y have the value 1; otherwise it has the value 1.


(a)  Build  a  circuit  using  AND,  OR  and  NOT  gates  which  implements 
this  operator. 

(b)  Show  how  to  build  circuits  for  computing  x' ,  x+y  and  xy  only 
using  NAND  gates. 


8. The NOR gate has the following definition (and symbol):

    x   y   x ↓ y
    0   0     1
    0   1     0
    1   0     0
    1   1     0

[Figure: NOR gate symbol with inputs x and y and output z = x ↓ y.]

That is, the output z = x ↓ y has the value 1 if neither of the inputs
x nor y has the value 1; otherwise it has the value 0.

(a) Build a circuit using AND, OR and NOT gates which implements
this operator.

(b) Show how to build circuits for computing x', x+y and xy only
using NOR gates.

9.  Build  circuits  which  implement  the  following  Boolean  expressions. 

(a) (a + b)(b + c)

(b) a'b + (b + c)'

(c) (ab)' + (bc)'

10.  Describe  the  behaviour  of  the  following  circuits  by  providing  a  Boolean 
expression  and  a  truth  table  defining  the  output  value  X. 

(a) 


(b) 


11. Joel, Felix and Oskar are using a simple voting machine to cast
secret ballots to decide which DVD to watch tonight, the choice being
between  the  latest  Final  Destination  film  and  the  new  Fockers  film. 
Each  of  them  will  vote  either  “0”  for  Final  Destination  or  “1”  for  the 
Fockers;  and  they  will  then  watch  whichever  film  receives  the  majority 
of  the  three  votes. 

Build  a  circuit  which  accepts  three  inputs  J,  F  and  O  representing 
their  respective  votes,  and  produces  one  output  X  representing  the 
outcome  of  the  election. 

12. A multiplexer is a circuit with three input lines x0, x1 and s, and
one output line r, defined as follows:

    x0  x1  s   r
    0   0   0   0
    0   0   1   0
    0   1   0   0
    0   1   1   1
    1   0   0   1
    1   0   1   0
    1   1   0   1
    1   1   1   1

The s line acts as a selector; the value of the output r will be either
that of x0 or that of x1, depending on the value of s.

Build a circuit which implements this multiplexer.

(Hint: First argue that r = s'x0 + sx1.)


Chapter  4 

Predicate  Logic 


Death  is  more  universal  than  life;  everyone  dies  but  not  everyone 
lives. 

-  Andrew  Sachs. 

Propositional logic allows us to express and reason about simple
propositions. However, we quickly run into its limitations. For example, Augustus
De Morgan made the following deduction:

All  horses  are  animals. 

Therefore,  all  horse-heads  are  animal-heads. 

This  deduction  is  certainly  valid.  However,  this  cannot  be  demonstrated 
using  propositional  logic,  as  there  is  no  way  to  discuss  the  properties  of 
individual  horses  or  animals,  let  alone  their  heads. 

In this section we extend propositional logic to include predicates -
properties which may be true or false of particular elements in a given
universe - and quantifiers - the means by which we refer to elements which
satisfy such properties.


Predicates  and  Free  Variables 


Recall  how  we  defined  the  set  of  prime  numbers: 

{ x  :  x  is  a  prime  number}. 

We  used  this  example  to  introduce  the  general  scheme  for  defining  sets  as 
the  collection  of  all  objects  which  satisfy  some  property: 

{  x  :  x  has  property  P  } 

denotes the set of all objects x which satisfy the property P. Such a property
is referred to as a predicate, and we write P(x) to say that “the object x
has property P.” A predicate is an indeterminate proposition which is true
or false of any particular element x of a given universe. Thus, for example,

Prime(x) = “x is a prime number”

denotes the predicate which stipulates that the element x is a prime number;
the universe of discourse in this instance, that is, the set of values which
x may range over, would most naturally be the set of natural numbers N
(though it could be anything; in the case that x were not a natural number,
the predicate Prime(x) would be false).

Predicates differ from propositions in that they do not have a fixed truth
value, since we do not know the value of the object to which they refer:
Prime(x) may be true or false, depending on what value x refers to. The
variable x is referred to as a free variable. If we instantiate the free variable
in such a predicate, we get a proposition. For example, Prime(7) is
a true proposition (7 is a prime number), while Prime(9) is a false
proposition (9 = 3 · 3 is not a prime number). The set of objects which satisfy a
predicate, that is, which make the predicate true, is called the truth set of
the predicate. Thus, for example, the truth set of the predicate Prime(x)
is the set of prime numbers. When we define a set by { x : P(x) }, we are
defining it to be the truth set of the predicate P(x).
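
As an informal illustration, a predicate over a finite universe can be sketched in Python as a function returning a truth value, with its truth set written as a set comprehension. The names is_prime and UNIVERSE, and the cut-off at 30, are assumptions made purely for this sketch; the true universe N is of course infinite.

```python
def is_prime(x):
    """Prime(x): True exactly when x is a prime number."""
    if not isinstance(x, int) or x < 2:
        return False
    return all(x % d != 0 for d in range(2, int(x ** 0.5) + 1))

# Assumption: a finite slice of the natural numbers stands in for N.
UNIVERSE = range(0, 30)

# Instantiating the free variable turns the predicate into a proposition.
prime_7 = is_prime(7)   # the proposition Prime(7)
prime_9 = is_prime(9)   # the proposition Prime(9)

# The truth set { x : Prime(x) }, restricted to the finite universe.
truth_set = {x for x in UNIVERSE if is_prime(x)}
print(prime_7, prime_9, sorted(truth_set))
```

The function itself has no truth value until it is applied to an argument, which mirrors the distinction between a predicate and a proposition.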


Example 4.1

Let the universe of discourse be the Duck family:

Ducks = { Quackmore, Hortense, Scrooge,
Donald, Della, Huey, Louis, Dewey },

and define the following predicate:

Female(x) = “x is a female.”

Then

• Female(Hortense) and Female(Della) are both true;

• Female(Quackmore), Female(Scrooge), Female(Donald), Female(Huey),
Female(Louis) and Female(Dewey) are all false;

• the truth set of the predicate Female(x) is { Hortense, Della }.


Predicates may range over more than one element. As familiar examples,
equality and set inclusion are predicates that range over two elements. In
these cases, infix notation “x = y” and “x ⊆ y” is more natural to use than
prefix notation “=(x, y)” and “⊆(x, y).” The statement 5 = 5, for example,
is true, whereas the statement {∅} = ∅ is false.

The truth set of a predicate which ranges over more than one element
consists of tuples of values; the number of coordinates of the tuples is equal
to the number of free variables in the predicate. The tuples in the truth
set represent those values that we can instantiate the free variables with in
order to turn the predicate into a true proposition.

Example 4.2

We may use Divides(x, y) to denote the two-place predicate over integers
which stipulates that x divides evenly into y. In this case,

Divides(3, 15)

is true, since 3 divides evenly into 15 (5 times), while

Divides(4, 15)

is false, since 4 does not divide evenly into 15. The truth set of the predicate
Divides is the set of pairs (x, y) such that x divides evenly into y:

{ (x, y) : x divides evenly into y }.

The standard mathematical symbol for this predicate is | and is written in
infix notation, as in 3 | 15 and 4 ∤ 15.
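
A two-place predicate can be sketched in the same way as a function of two arguments, and its truth set as a set of pairs. The sketch below is only illustrative: the function name divides, the finite range VALUES and the treatment of x = 0 are assumptions, not part of the text.

```python
def divides(x, y):
    """Divides(x, y): x divides evenly into y, written x | y."""
    if x == 0:
        # Assumed convention for this sketch: 0 divides only 0.
        return y == 0
    return y % x == 0

# Assumption: a small finite range stands in for the integers.
VALUES = range(1, 7)

# The truth set is a set of pairs, one coordinate per free variable.
truth_set = {(x, y) for x in VALUES for y in VALUES if divides(x, y)}

print(divides(3, 15), divides(4, 15))
print(sorted(truth_set))
```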

Exercise 4.2 (Solution on page 424)

What are the truth sets of the following predicates?

1. Even(x) = “x is an even integer.”

2. EvenPrime(x) = “x is an even prime number.”

3. DeadlySin(x) = “x is a deadly sin.”

4. Sum(x, y, z) = “x, y and z are integers, and x + y = z.”

5. Sum(u, 5, v), where Sum(x, y, z) is the predicate defined above.


Before Joel, Felix, Oskar and Amanda go to school in the morning, they
have to remember to brush their teeth; that is, the predicate

Teeth(x),

which denotes that child x has brushed their teeth, must be true of each
of them. To this end, each child is asked in turn if they have brushed their
teeth, in order to ensure that the compound proposition

Teeth(Joel) ∧ Teeth(Felix) ∧ Teeth(Oskar) ∧ Teeth(Amanda)

is true. The universe of discourse is the set consisting of the four children:

Children = {Joel, Felix, Oskar, Amanda}.

After  this  final  check,  they  get  into  the  car  and  head  off  to  school.  One 
of  the  children  has  to  sit  in  the  front  passenger’s  seat,  as  there  is  only  room 
for  three  passengers  in  the  back  seat.  Thus,  the  predicate 

Front(x), 

which  denotes  that  child  x  sits  in  the  front  seat,  must  be  true  of  some  one 
of  them.  They  regularly  argue  over  who  this  will  be  -  either  for,  if  they 
want  to  get  away  from  their  siblings,  or  against,  to  continue  a  joint  activity 
-  but  the  compound  proposition 

Front(Joel) ∨ Front(Felix) ∨ Front(Oskar) ∨ Front(Amanda)

must  somehow  be  true. 

In fact, as there is only room for one child in the front seat, the predicate
Front(x) must be true of exactly one child; that is, it must be true of one
and false of all of the others. This means that the following proposition
must be true:

(Front(Joel) ∧ ¬Front(Felix) ∧ ¬Front(Oskar) ∧ ¬Front(Amanda))

∨

(¬Front(Joel) ∧ Front(Felix) ∧ ¬Front(Oskar) ∧ ¬Front(Amanda))

∨

(¬Front(Joel) ∧ ¬Front(Felix) ∧ Front(Oskar) ∧ ¬Front(Amanda))

∨

(¬Front(Joel) ∧ ¬Front(Felix) ∧ ¬Front(Oskar) ∧ Front(Amanda))

That  is:  either  Joel  sits  in  the  front  seat  and  none  of  the  others  do;  or  Felix 
sits  in  the  front  seat  and  none  of  the  others  do;  or  Oskar  sits  in  the  front 
seat  and  none  of  the  others  do;  or  Amanda  sits  in  the  front  seat  and  none 
of  the  others  do. 
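
The four-way disjunction can be evaluated mechanically for any particular seating. The following is a minimal sketch; the function name exactly_one_front and the sample seatings are hypothetical and serve only to exercise the formula.

```python
CHILDREN = ["Joel", "Felix", "Oskar", "Amanda"]

def exactly_one_front(front):
    """Evaluate the disjunction of the four 'this child and no other' cases.

    `front` maps each child's name to the truth value of Front(child).
    """
    return any(
        front[c] and all(not front[other] for other in CHILDREN if other != c)
        for c in CHILDREN
    )

# Hypothetical seatings used only to exercise the formula.
seating_one = {"Joel": False, "Felix": True, "Oskar": False, "Amanda": False}
seating_two = {"Joel": True, "Felix": True, "Oskar": False, "Amanda": False}
seating_none = dict.fromkeys(CHILDREN, False)

print(exactly_one_front(seating_one))
print(exactly_one_front(seating_two))
print(exactly_one_front(seating_none))
```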

These propositions are lengthy already when there are only four elements
in the universe of discourse. Furthermore, we would not be able to write out
formulae to check if some or all elements satisfy a property if the universe
of discourse is infinite. For example, to express the fact that every prime
number greater than 2 is odd, using the predicate Odd(x) to mean that x
is an odd number, would require an infinitely-long conjunction:

Odd(3) ∧ Odd(5) ∧ Odd(7) ∧ Odd(11) ∧ Odd(13) ∧ ···

Similarly, to express the statement that some primes are square, using the
predicate Square(x) to mean that x is a perfect square, would require an
infinitely-long disjunction:

Square(2) ∨ Square(3) ∨ Square(5) ∨ Square(7) ∨ ···

However, we cannot express infinite conjunctions and disjunctions in
propositional logic.

Predicate  logic  provides  two  forms  of  quantification  which  allow  you  to 
express  when  properties  are  true  of  all  elements  in  the  universe  of  discourse, 
or  true  of  at  least  some  elements  in  the  universe.  These  are  outlined  as 
follows. 

4.2.1  Universal  Quantification 

When we want to express that a predicate P(x) is true of all elements x of
the universe of discourse, we can write:

∀x P(x)

which is pronounced as

“for all x, P(x)”;

it is true if, and only if, the predicate P(x) is true of all possible values of x.
This is called universal quantification.

For example, instead of writing

Teeth(Joel) ∧ Teeth(Felix) ∧ Teeth(Oskar) ∧ Teeth(Amanda)

to express that Teeth(x) is true of all four children, we can simply write

∀x Teeth(x)

which says the same thing, that everyone has brushed their teeth (assuming
the universe of discourse is the set of the four children).

Notice that ∀x Teeth(x) is a proposition: it has a definite truth value.
The variable x is not a free variable in this case; it is a bound variable; it
is bound by the quantifier “∀x”.
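
Over a finite universe, universal quantification corresponds closely to Python's built-in all. The sketch below assumes a hypothetical record of who has brushed (the set brushed) and a helper for_all that is not part of the text.

```python
CHILDREN = {"Joel", "Felix", "Oskar", "Amanda"}

# Hypothetical data: Felix has not brushed his teeth yet.
brushed = {"Joel", "Oskar", "Amanda"}

def teeth(x):
    """Teeth(x): child x has brushed their teeth."""
    return x in brushed

def for_all(universe, predicate):
    """Truth value of the proposition 'for all x, predicate(x)'."""
    return all(predicate(x) for x in universe)

everyone_brushed = for_all(CHILDREN, teeth)      # Felix is a counterexample
brushed.add("Felix")
everyone_brushed_now = for_all(CHILDREN, teeth)  # no counterexample remains

print(everyone_brushed, everyone_brushed_now)
```

Note that for_all takes no argument for x: the variable is bound inside the generator expression, just as x is bound by the quantifier.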


Example 4.3

The statement

“Nobody did the homework”

is expressed as:

∀x ¬H(x)

where H(x) = “x did the homework”.

The universe of discourse is (assumed to be) the set of students who were
assigned the homework to do.

Notice that saying something is true of nobody is a universal
quantification: it is the same as saying that this something is false of everybody.
In this case, we are saying that every student failed to do the homework.


Example 4.4

The statement

“Every dog that has stayed in the kennel will have to go into
quarantine”

is expressed as:

∀x (K(x) ⇒ Q(x))

where K(x) = “x has stayed in the kennel”

Q(x) = “x will have to go into quarantine”.

The  universe  of  discourse  is  (assumed  to  be)  the  set  of  dogs,  only  some  of 
which  have  stayed  in  the  kennel  in  question. 

This example demonstrates how to quantify universally over a subset
of the universe of discourse: we simply stipulate that a property holds of
something whenever it is a member of the subset of interest (that is, if
it satisfies the predicate defining this subset). In this case, by using the
implication

K(x) ⇒ Q(x)

we are not stating that every dog will have to go into quarantine, but only
those dogs that have stayed in the kennel. If a particular dog x has not
stayed in the kennel - that is, if K(x) is not true - then that dog x need not
go into quarantine - that is, Q(x) need not be true. (Of course, this dog x
might have to go into quarantine for some other reason; it is not necessarily
the case that Q(x) is false.)
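
The role of the implication can be seen on a small, entirely hypothetical set of dogs (the names and data below are invented for this sketch). A dog that has not stayed in the kennel satisfies K(x) ⇒ Q(x) whether or not it is quarantined.

```python
def implies(p, q):
    """Truth value of p ⇒ q."""
    return (not p) or q

# Hypothetical data: which dogs stayed in the kennel, which are quarantined.
dogs = {
    "Rex":  {"kennel": True,  "quarantine": True},
    "Fido": {"kennel": False, "quarantine": False},
    "Spot": {"kennel": False, "quarantine": True},  # quarantined for another reason
}

def K(x):
    return dogs[x]["kennel"]

def Q(x):
    return dogs[x]["quarantine"]

# ∀x (K(x) ⇒ Q(x)): only kennel dogs are constrained.
rule_holds = all(implies(K(x), Q(x)) for x in dogs)

# ∀x Q(x): a stronger claim, which fails here because of Fido.
every_dog_quarantined = all(Q(x) for x in dogs)

print(rule_holds, every_dog_quarantined)
```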


Note that universal quantification is assumed to bind more strongly than
all of the propositional connectives; that is, it is given higher precedence.
For example, in the above example we wrote

∀x (K(x) ⇒ Q(x))

and not

∀x K(x) ⇒ Q(x)

as the latter would be interpreted as

(∀x K(x)) ⇒ Q(x)

which says “if every dog has stayed in the kennel then x will have to go
into quarantine.” This is certainly not what is intended; in particular, it
is a predicate with a free variable x - appearing in Q(x) - and is therefore
not a proposition.


Example 4.5

The statement

“Nobody likes a sore loser”

is expressed as:

∀x (S(x) ⇒ ∀y ¬L(y, x))

where S(x) = “x is a sore loser”

L(y, x) = “y likes x”.

The  universe  of  discourse  is  (assumed  to  be)  the  collection  of  all  people. 

This  proposition  is  saying  the  following  is  true  of  every  person  x:  if  x  is 
a  sore  loser,  then  every  person  y  does  not  like  x. 


Exercise 4.5 (Solution on page 424)

Using the predicates

B(x) = “x is a bee”
F(x) = “x is a flower”
L(x, y) = “x likes y”

write each of the following statements in predicate logic.

1. All bees like all flowers.

2. Bees only like flowers.

3. Only bees like flowers.


4.2.2  Existential  Quantification 

When we want to express that a predicate P(x) is true of at least some
element x of the universe of discourse, we can write:

∃x P(x)

which is pronounced as

“there exists x such that P(x)”;

it is true if, and only if, the predicate P(x) is true of some value of x. For
example, instead of writing

Front(Joel) ∨ Front(Felix) ∨ Front(Oskar) ∨ Front(Amanda)

to express that Front(x) is true of at least one of the four children, we can
simply write

∃x Front(x)

which says the same thing, that someone sits in the front seat (again,
assuming the universe of discourse is the set of the four children).

Again, the variable x in ∃x Front(x) is a bound variable, bound by the
quantifier “∃x”; and like universal quantification, existential quantification
is assumed to bind more strongly than all of the propositional connectives.
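
Over a finite universe, existential quantification corresponds to Python's built-in any. In this sketch the set front_seat and the helper exists are hypothetical names introduced for illustration.

```python
CHILDREN = ["Joel", "Felix", "Oskar", "Amanda"]

# Hypothetical data: today Oskar sits in the front.
front_seat = {"Oskar"}

def front(x):
    """Front(x): child x sits in the front seat."""
    return x in front_seat

def exists(universe, predicate):
    """Truth value of the proposition 'there exists x such that predicate(x)'."""
    return any(predicate(x) for x in universe)

someone_in_front = exists(CHILDREN, front)

# The explicit four-way disjunction gives the same answer.
disjunction = front("Joel") or front("Felix") or front("Oskar") or front("Amanda")

print(someone_in_front, disjunction)
```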


Example 4.6

The statement

“Someone didn’t do the homework”

is expressed as:

∃x ¬H(x)

where H(x) = “x did the homework”.

The universe of discourse is again (assumed to be) the set of students who
were assigned the homework to do.

This proposition states that ¬H(x) holds of some student: perhaps no
one did the homework (as expressed by the proposition given in
Example 4.3); or perhaps several did the homework while several others didn’t;
or perhaps all but one person did the homework. This proposition doesn’t
distinguish between these possibilities; it merely notes that at least one
element of the universe of discourse satisfies the predicate, that is, at least one
person did not do the homework.


Example 4.7

The statement

“If some dog that has stayed in the kennel has been in contact with
a dog with rabies, then every dog that has stayed in the kennel will
have to go into quarantine”

is expressed as:

∃x (K(x) ∧ ∃y (C(x, y) ∧ R(y))) ⇒ ∀x (K(x) ⇒ Q(x))

where K(x) = “x has stayed in the kennel”

R(x) = “x has rabies”

C(x, y) = “x and y have been in contact”

Q(x) = “x will have to go into quarantine”.


Exercise 4.7 (Solution on page 424)

Assuming the universe of discourse is the set of human beings, consider the
following predicates:

Male(x) = “x is male”
Female(x) = “x is female”

Parent(x, y) = “x is a parent of y”
Father(x, y) = “x is the father of y”
Mother(x, y) = “x is the mother of y”

Sibling(x, y) = “x and y are siblings”
Cousin(x, y) = “x and y are cousins”

Using these predicates, express the following properties in predicate logic.

1. Every human is either male or female, but no human is both.

2. Mothers are female parents.

3. Every human has exactly one mother and exactly one father.

4. Siblings have the same parents.

5. Cousins each have a parent such that these two parents are siblings.


Exercise 4.8 (Solution on page 425)

Using the following predicates:

Horse(h) = “h is a horse”

Animal(a) = “a is an animal”

Head(x, y) = “x is the head of y”

formalise the following argument in predicate logic:

All horses are animals.

Therefore, all horse heads are animal heads.

Explain why the argument is valid.


4.2.3 Bounded Quantifications

There are two forms of bounded quantification which we use for
convenience. These restrict the range of the variables being quantified.

Firstly, to declare that the predicate P(x) is true of every element of the
set A, we write

∀x∈A P(x)

which is pronounced as

“for all values x in A, P(x)”.

This is logically equivalent to

∀x (x∈A ⇒ P(x)).

Similarly, to declare that the predicate P(x) is true of some element of the
set A, we write

∃x∈A P(x)

which is pronounced as

“there is some value x in A such that P(x)”.

This is logically equivalent to

∃x (x∈A ∧ P(x)).

One further useful restriction for the existential quantifier is to declare that
exactly one value x satisfies P(x). This is written

∃!x P(x)

which is pronounced as

“there is exactly one value x such that P(x)”.

This is logically equivalent to

∃x (P(x) ∧ ¬∃y (P(y) ∧ y ≠ x)).

This says that there is a value x such that P(x), but there is not a different
value y ≠ x such that P(y). For example, if the predicate Front(x) denotes
that child x sits in the front seat of the car, where, again, the universe of
discourse is the set of the four children, then

∃!x Front(x)

states that exactly one of the children sits in the front seat.

Note that you can combine the last two constructions to declare that
exactly one value from a set A satisfies P(x): ∃!x∈A P(x). Also note that
x is of course bound by the quantifiers in each case.
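
For a finite set A, the three bounded forms can be sketched directly from their definitions. The helper names below (for_all_in, exists_in, exists_unique_in) and the sample predicates are assumptions introduced for illustration; exists_unique_in follows the displayed equivalence for ∃! term by term, with both quantifiers restricted to A.

```python
def for_all_in(A, P):
    """∀x∈A P(x)."""
    return all(P(x) for x in A)

def exists_in(A, P):
    """∃x∈A P(x)."""
    return any(P(x) for x in A)

def exists_unique_in(A, P):
    """∃!x∈A P(x), unfolded as ∃x∈A (P(x) ∧ ¬∃y∈A (P(y) ∧ y ≠ x))."""
    return any(
        P(x) and not any(P(y) and y != x for y in A)
        for x in A
    )

A = range(1, 11)  # the integers 1..10, chosen arbitrarily for the sketch

def is_positive(x):
    return x > 0

def is_even(x):
    return x % 2 == 0

def squares_to_49(x):
    return x * x == 49

print(for_all_in(A, is_positive))          # every element of A is positive
print(exists_in(A, is_even))               # some element of A is even
print(exists_unique_in(A, squares_to_49))  # only 7 qualifies
print(exists_unique_in(A, is_even))        # several elements qualify
```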

Example 4.8

You may be aware that √2 is irrational: that it cannot be expressed as a
fraction p/q. (We shall justify this claim in Example 5.6, page 139.) In fact,
any nonnegative integer is either a perfect square, such as 25 = 5², or its
square root is irrational. We can express this fact as follows:

∀n∈Z (∃k∈Z (n = k²) ∨ ¬∃q∈Q (n = q²)).

This says that for all integers n, either there exists another integer k such
that n = k² (that is, n is a perfect square with square root k), or there does
not exist a rational number q such that n = q² (that is, it does not have a
rational square root).


Example 4.9

Recall the following puzzle from Exercise 1.16 (page 35). Joel, Felix and
Oskar each write their name on a piece of paper, and then exchange the
pieces of paper so that no one has the piece with their own name on it.
They then hold these pieces of paper so that Amanda can’t see what’s on
them, but tell her that each has the name of one of the others, and they
challenge her to figure out who is holding each name. She is allowed to look
at the name written on any one piece of paper. She decides to look at Joel’s
piece, and finds “Oskar” written on it.

Let Boys = { Joel, Felix, Oskar } be the set of three boys; and let
Papers = { J, F, O } be the set of three pieces of paper with names written
on them: J is the piece with “Joel” written on it; F is the piece with “Felix”
written on it; and O is the piece with “Oskar” written on it. Furthermore,
let Holds(b, p) be the predicate which says that boy b holds the piece of
paper p. Then we can formulate the conditions described in this problem as
follows:

1. Each boy holds precisely one piece of paper:

∀b∈Boys ∃!p∈Papers Holds(b, p).

2. Each piece of paper is held by precisely one boy:

∀p∈Papers ∃!b∈Boys Holds(b, p).

3. No piece of paper is being held by the boy whose name is on the paper:

¬Holds(Joel, J) ∧ ¬Holds(Felix, F) ∧ ¬Holds(Oskar, O).

4. Joel’s piece of paper has “Oskar” written on it:

Holds(Joel, O).
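
Since the universe here is tiny, the four conditions can be checked by brute force over every possible Holds relation (every subset of Boys × Papers). This is a sketch under assumed names (OWN, exactly_one, satisfies); it is one way to search for models of the conditions, not necessarily the method intended by the exercise.

```python
from itertools import product

BOYS = ["Joel", "Felix", "Oskar"]
PAPERS = ["J", "F", "O"]
OWN = {"Joel": "J", "Felix": "F", "Oskar": "O"}  # the paper bearing each boy's name

def exactly_one(values):
    """True when exactly one of the given truth values is True."""
    return sum(1 for v in values if v) == 1

def satisfies(holds):
    """Check conditions 1 to 4 for a candidate relation `holds` (a set of pairs)."""
    c1 = all(exactly_one((b, p) in holds for p in PAPERS) for b in BOYS)
    c2 = all(exactly_one((b, p) in holds for b in BOYS) for p in PAPERS)
    c3 = all((b, OWN[b]) not in holds for b in BOYS)
    c4 = ("Joel", "O") in holds
    return c1 and c2 and c3 and c4

PAIRS = [(b, p) for b in BOYS for p in PAPERS]

# Enumerate all 2^9 candidate relations and keep those meeting every condition.
solutions = []
for bits in product([False, True], repeat=len(PAIRS)):
    holds = {pair for pair, bit in zip(PAIRS, bits) if bit}
    if satisfies(holds):
        solutions.append(holds)

print(solutions)
```

Exactly one relation survives, so the four conditions determine who holds which piece of paper.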


Exercise 4.9 (Solution on page 425)

Let T(s, c) stand for the predicate “student s takes course c.” Express the
following statements in predicate logic.

1. “Alice and Bob take exactly one course together.”

2. “Alice and Bob take exactly two courses together.”


If it is not the case that the predicate P(x) is true for all values of x, then
this must mean that P(x) is not true for some value of x; that is,

¬∀x P(x) ⇔ ∃x ¬P(x).

Equally, if it is not the case that the predicate P(x) is true for some value
of x, then this must mean that P(x) is false for every value of x; that is,

¬∃x P(x) ⇔ ∀x ¬P(x).


These two laws coincide with De Morgan’s Laws:

¬(P ∧ Q) ⇔ ¬P ∨ ¬Q
¬(P ∨ Q) ⇔ ¬P ∧ ¬Q

if we consider universal quantification as a (potentially) infinite conjunction,
and existential quantification as a (potentially) infinite disjunction. Suppose
that the universe of discourse is U = { a, b, c, ... }. Then

¬∀x P(x) ⇔ ¬(P(a) ∧ P(b) ∧ P(c) ∧ ···)
         ⇔ ¬P(a) ∨ ¬P(b) ∨ ¬P(c) ∨ ···    (De Morgan’s Law)
         ⇔ ∃x ¬P(x);

and

¬∃x P(x) ⇔ ¬(P(a) ∨ P(b) ∨ P(c) ∨ ···)
         ⇔ ¬P(a) ∧ ¬P(b) ∧ ¬P(c) ∧ ···    (De Morgan’s Law)
         ⇔ ∀x ¬P(x).


Example 4.10

Recall from the Example in Section 4.2 that Joel, Felix, Oskar and Amanda
must all brush their teeth before going to school in the morning; that is,
that the proposition

∀x Teeth(x)

is true, where - as before - we use Teeth(x) to denote the statement that
child x has brushed their teeth, and we continue to take the universe of
discourse to consist of the set of four children in question:

Children = {Joel, Felix, Oskar, Amanda}.

On a particular day, it may be discovered that this statement is not true.
For example, perhaps Joel, Oskar and Amanda have all brushed their teeth,
but Felix has not. This is the reason that ∀x Teeth(x) is false, i.e., that

¬∀x Teeth(x)

is true: that there is someone (namely Felix) who has not brushed their
teeth:

∃x ¬Teeth(x)

This is an example of the general law that

¬∀x P(x) ⇔ ∃x ¬P(x).

We have also earlier noted that, when driving to school, one of the
children must sit in the front seat of the car: that is, that the statement

∃x Front(x)

must be true, where - as before - we use Front(x) to denote the statement
that child x sits in the front seat. For this statement to be false, it would
have to mean that none of the children are sitting in the front seat, or in
other words that all of them are not sitting in the front seat:

∀x ¬Front(x).

This is an example of the general law that

¬∃x P(x) ⇔ ∀x ¬P(x).
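
Both laws can be checked mechanically on a finite universe. The sketch below first replays the teeth-brushing scenario with hypothetical data, then tests both laws against every assignment of truth values to four elements; the name laws_hold is an assumption of the sketch, and a finite check of this kind illustrates the laws rather than proving them in general.

```python
from itertools import product

CHILDREN = ["Joel", "Felix", "Oskar", "Amanda"]
brushed = {"Joel", "Oskar", "Amanda"}  # hypothetical: Felix has not brushed

def teeth(x):
    return x in brushed

not_all = not all(teeth(x) for x in CHILDREN)   # ¬∀x Teeth(x)
some_not = any(not teeth(x) for x in CHILDREN)  # ∃x ¬Teeth(x)

def laws_hold(values):
    """Check both quantifier laws for one assignment of truth values."""
    law1 = (not all(values)) == any(not v for v in values)
    law2 = (not any(values)) == all(not v for v in values)
    return law1 and law2

# Every way P could behave on a four-element universe (16 cases).
all_cases_ok = all(laws_hold(v) for v in product([False, True], repeat=4))

print(not_all, some_not, all_cases_ok)
```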


Exercise 4.10 (Solution on page 425)

For each of the following statements, identify which of the options provided
correctly expresses its negation. Translate each statement into predicate
logic to confirm your choices.

1. Some people like mathematics.

(a)  Some  people  dislike  mathematics. 

(b)  Everybody  dislikes  mathematics. 

(c)  Everybody  likes  mathematics. 

2.  All  cats  have  fur  and  a  tail. 

(a)  No  cat  has  fur  and  a  tail. 

(b)  Some  cats  are  bald  and  tailless. 

(c)  Some  cats  are  bald  or  tailless. 

3.  Everyone  who  had  not  been  vaccinated  got  sick. 

(a)  Everyone  who  had  been  vaccinated  did  not  get  sick. 

(b)  Some  people  who  had  been  vaccinated  got  sick. 

(c)  Some  people  who  had  not  been  vaccinated  did  not  get  sick. 


Having established how quantifiers interact with negation, we next
consider how they interact with conjunction and disjunction. Specifically, we
may wonder which of the following are valid:

1. ∀x (P(x) ∧ Q(x)) ⇔ ∀x P(x) ∧ ∀x Q(x).

2. ∃x (P(x) ∧ Q(x)) ⇔ ∃x P(x) ∧ ∃x Q(x).

3. ∀x (P(x) ∨ Q(x)) ⇔ ∀x P(x) ∨ ∀x Q(x).

4. ∃x (P(x) ∨ Q(x)) ⇔ ∃x P(x) ∨ ∃x Q(x).

We carefully consider each of these in turn.


1. This property is valid.

If P(x) ∧ Q(x) is true of every object x, then certainly P(x) must be
true of every object x and Q(x) must be true of every object x.

Equally, if P(x) is true of every object x and Q(x) is true of every
object x, then P(x) ∧ Q(x) must be true of every object x.

2. This property is not valid.

If P(x) ∧ Q(x) is true of some object x, then P(x) must be true of that
object x and Q(x) must be true of that object x.

However, P(x) may be true of some object, and Q(x) may be true of
some different object, while P(x) ∧ Q(x) may never be true of the
same object x.

For example, it is true that prime numbers and perfect squares exist:

∃x Prime(x) ∧ ∃x Square(x) is true.

For instance Prime(17) is true and Square(25) is true. However, no
number can be both prime and a perfect square at the same time:

∃x (Prime(x) ∧ Square(x)) is false.

We have, however, established the weaker property:

2'. ∃x (P(x) ∧ Q(x)) ⇒ ∃x P(x) ∧ ∃x Q(x).

3. This property is not valid.

If P(x) is true for all objects x, then certainly P(x) ∨ Q(x) must be
true of all objects x. Equally, if Q(x) is true for all objects x, then
P(x) ∨ Q(x) must be true of all objects x.

However, P(x) ∨ Q(x) may be true of all objects x without it being
the case that P(x) is true of all objects x, nor that Q(x) is true of all
objects x.

For example, it is true that all integers are either even or odd:

∀x (Even(x) ∨ Odd(x)) is true.

However, not every integer is even, and not every integer is odd:

∀x Even(x) ∨ ∀x Odd(x) is false.

We have, however, established the weaker property:

3'. ∀x (P(x) ∨ Q(x)) ⇐ ∀x P(x) ∨ ∀x Q(x).

4. This property is valid.

If P(x) ∨ Q(x) is true of some object x, then either P(x) must be true
of that object x or Q(x) must be true of that object x.

Equally, if P(x) is true of some object x or Q(x) is true of some object
x, then P(x) ∨ Q(x) must be true of that object x.
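
The two counterexamples above can be reproduced on a finite range of integers. In this sketch the range U = 1..30 and the helper names are assumptions; the outcome over this range agrees with the argument in the text, but a finite check is only an illustration.

```python
from math import isqrt

def prime(x):
    return x >= 2 and all(x % d != 0 for d in range(2, isqrt(x) + 1))

def square(x):
    return x >= 0 and isqrt(x) ** 2 == x

def even(x):
    return x % 2 == 0

def odd(x):
    return x % 2 == 1

U = range(1, 31)  # assumed finite stand-in for the integers

# Property 2: the two sides differ, so the equivalence fails.
lhs2 = any(prime(x) and square(x) for x in U)
rhs2 = any(prime(x) for x in U) and any(square(x) for x in U)

# Property 3: again the two sides differ.
lhs3 = all(even(x) or odd(x) for x in U)
rhs3 = all(even(x) for x in U) or all(odd(x) for x in U)

print(lhs2, rhs2)  # left side false, right side true
print(lhs3, rhs3)  # left side true, right side false
```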

As a final note, the following are clearly valid properties:

1. ∀x ∀y P(x, y) ⇔ ∀y ∀x P(x, y);

2. ∃x ∃y P(x, y) ⇔ ∃y ∃x P(x, y).

That is, we can rearrange the order in which universal quantifications are
applied, as well as the order in which existential quantifications are applied.
It is common practice to write these as ∀x,y P(x, y) and ∃x,y P(x, y),
respectively. However, as we see in the following example, we cannot rearrange
different quantifiers:

∀x ∃y P(x, y) ⇎ ∃y ∀x P(x, y).


Example 4.11

A certain mathematics textbook has an exercise which asks its reader to
translate the following sentence into predicate logic:

“Every real number is smaller than some integer.”

This informal English sentence can be interpreted in (at least) the following
two different ways:

1. ∀r∈R ∃n∈Z (r < n)

Given any real number r, we can find a larger integer n.

2. ∃n∈Z ∀r∈R (r < n)

There is an integer which is larger than every real number.
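
That the order of different quantifiers matters can be seen on a small finite universe. Since R and Z cannot be enumerated, the sketch below substitutes an assumed stand-in: the universe {0, 1, 2, 3} with the hypothetical predicate P(x, y) = “y is the successor of x modulo 4”. It illustrates the general point about quantifier order, not the specific claim about real numbers.

```python
U = range(4)

def P(x, y):
    """Assumed example predicate: y is the successor of x modulo 4."""
    return y == (x + 1) % 4

# ∀x ∃y P(x, y): every x has some successor.
forall_exists = all(any(P(x, y) for y in U) for x in U)

# ∃y ∀x P(x, y): a single y is the successor of every x.
exists_forall = any(all(P(x, y) for x in U) for y in U)

print(forall_exists, exists_forall)
```

In the first form the choice of y may depend on x; in the second, one y must work for every x at once.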

The first of these statements is true - and is undoubtedly the interpretation
intended by the author - while the second statement is blatantly false. The
author of this mathematics textbook was trying to state a basic fact about
numbers, but the ambiguity of English complicated this task.


The language of predicate logic gives us tools on top of propositional logic
and set theory with which to model scenarios. In this section we present
a few examples.


Example 4.12

Recall the Carrollean puzzle from Exercise 2.25, where we are given the
three premises:

All babies are illogical.

Nobody is despised who can manage a crocodile.

Illogical persons are despised.

from which we are to deduce that no baby can manage a crocodile. Let us
introduce the following predicates:

B(x) = “x is a baby”

I(x) = “x is illogical”

D(x) = “x is despised”

M(x) = “x can manage a crocodile”

Then the above three premises translate into the following propositions:

1. ∀x (B(x) ⇒ I(x))

2. ∀x (M(x) ⇒ ¬D(x)) or, equivalently, ∀x (D(x) ⇒ ¬M(x))

3. ∀x (I(x) ⇒ D(x))

and the conclusion translates into ∀x (B(x) ⇒ ¬M(x)).

Now, for any x such that B(x) is true (that is, x is a baby), by the first
premise, I(x) is true (x is illogical); and thus by the third premise, D(x) is
true (x is despised); and therefore by the second premise, ¬M(x) is true
(x cannot manage a crocodile).

Hence the conclusion does indeed follow from the premises.
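
As a sanity check on this reasoning, one can search small universes for a counterexample, that is, a world in which all three premises hold but the conclusion fails. The sketch below assumes a universe of two individuals (the name PEOPLE and its size are arbitrary choices); finding no counterexample there does not replace the argument above, it only corroborates it.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def premises(b, i, d, m):
    """The three premises, instantiated for one individual."""
    return implies(b, i) and implies(m, not d) and implies(i, d)

def conclusion(b, m):
    """The conclusion B(x) ⇒ ¬M(x), instantiated for one individual."""
    return implies(b, not m)

PEOPLE = range(2)  # assumed size of the universe for this check

# A world assigns truth values (B, I, D, M) to every individual.
individual_cases = list(product([False, True], repeat=4))

valid = True
for world in product(individual_cases, repeat=len(PEOPLE)):
    premises_hold = all(premises(*person) for person in world)
    conclusion_holds = all(conclusion(person[0], person[3]) for person in world)
    if premises_hold and not conclusion_holds:
        valid = False  # a counterexample world was found

print(valid)
```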

Exercise 4.12 (Solution on page 426)

Formalise  the  following  two  arguments  in  predicate  logic: 

1.  Everybody  loves  somebody. 

Therefore  somebody  is  loved  by  everybody. 

2.  Somebody  loves  everybody. 

Therefore  everybody  is  loved  by  somebody. 

In each case, discuss any ambiguities that you identify in the English
statements, but use what you consider to be the intended interpretations.

Are  these  arguments  valid? 


Example 4.13

Figure 4.1 presents an example Sudoku puzzle which consists of a 9 × 9
grid with numbers entered into some of the squares. The objective is to
completely fill in the grid so that each column, each row, and each of the
nine 3 × 3 blocks contains the digits from 1 to 9 exactly once. Properly set,
the initial numbers will allow for only one valid solution.

This  is  a  classic  logic-style  puzzle,  and  as  such  is  perfectly  suited  for 
modelling  in  predicate  logic.  If  you  struggle  with  solving  the  puzzle  given 
in  Figure  4.1,  an  Internet  search  engine  will  find  any  number  of  Web  sites 
which  will  solve  it  for  you;  and  the  means  by  which  these  Web  sites’  software 
does  this  will  inevitably  work  on  the  following  formal  representation  (or 
something  very  similar). 

We  start  by  defining  the  universe  of  discourse  to  be  the  interval  I  =  [1..9] 
of  integers  from  1  to  9.  This  reflects  the  fact  that  there  are: 


• 9 rows, listed from top to bottom as row 1 through to row 9;

• 9 columns, listed from left to right as column 1 through to column 9;

• 9 blocks, listed from left to right and top to bottom as block 1 through
to block 9;

• 9 values, 1 through 9, to be inserted into the squares.

Figure 4.1: A Sudoku puzzle.

We  then  define  the  following  predicate: 

V(i, j, k) = “square (i, j) holds the value k.”

That  is,  the  number  k  is  in  the  square  located  in  row  i  and  column  j.  Thus, 
for  the  example  puzzle  in  Figure  4.1,  the  following  propositions  are  true: 


V(1, 8, 6)  V(1, 9, 7)

V(2, 1, 4)  V(2, 5, 9)

V(3, 1, 3)  V(3, 4, 2)  V(3, 7, 9)  V(3, 8, 8)

V(4, 5, 2)  V(4, 6, 3)  V(4, 7, 6)

V(5, 2, 2)  V(5, 5, 6)  V(5, 8, 5)

V(6, 3, 1)  V(6, 4, 7)  V(6, 5, 4)

V(7, 2, 3)  V(7, 3, 4)  V(7, 6, 7)  V(7, 9, 9)

V(8, 5, 1)  V(8, 9, 6)

V(9, 1, 9)  V(9, 2, 8)

Next  we  define  the  following  predicate: 
B(i, j, b) = “square (i, j) is in block b.”

This property is represented by the following nine propositions (one for each
block):

B(i, j, 1) ⇔ (i, j) ∈ [1..3] × [1..3]

B(i, j, 2) ⇔ (i, j) ∈ [1..3] × [4..6]

B(i, j, 3) ⇔ (i, j) ∈ [1..3] × [7..9]

B(i, j, 4) ⇔ (i, j) ∈ [4..6] × [1..3]

B(i, j, 5) ⇔ (i, j) ∈ [4..6] × [4..6]

B(i, j, 6) ⇔ (i, j) ∈ [4..6] × [7..9]

B(i, j, 7) ⇔ (i, j) ∈ [7..9] × [1..3]

B(i, j, 8) ⇔ (i, j) ∈ [7..9] × [4..6]

B(i, j, 9) ⇔ (i, j) ∈ [7..9] × [7..9]

Finally, we are ready to represent the properties satisfied by a valid
solution to the puzzle.

1. Every square (i, j) holds exactly one value k: ∀i ∀j ∃!k V(i, j, k).

2. Every row i contains every value k: ∀i ∀k ∃j V(i, j, k).

3. Every column j contains every value k: ∀j ∀k ∃i V(i, j, k).

4. Every block b contains every value k: ∀b ∀k ∃i ∃j (V(i, j, k) ∧ B(i, j, b)).

All that is required now is to deduce truth values of the predicates V(i, j, k)

which  satisfy  these  properties.  This  is  a  non-trivial  and  tedious  task  to  do 
by  hand,  but  is  the  sort  of  thing  that  computers  can  do  very  well  (and  very 
rapidly). 
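
To make the formal representation concrete, the four properties can be transcribed almost literally into Python, with each universal quantifier becoming all(...) and each existential quantifier becoming any(...). This is only a sketch of a solution checker, not a solver: the names block, is_valid_solution and sample are assumptions introduced here, and sample is a standard pattern grid rather than the solution to Figure 4.1.

```python
def block(i, j):
    """Block number (1..9) of square (i, j), with rows and columns numbered 1..9."""
    return 3 * ((i - 1) // 3) + (j - 1) // 3 + 1

def is_valid_solution(grid):
    """grid[i-1][j-1] is the value in square (i, j); V(i, j, k) says it equals k."""
    I = range(1, 10)  # the universe of discourse [1..9]
    V = lambda i, j, k: grid[i - 1][j - 1] == k
    # 1. Every square holds exactly one value k in [1..9].
    one_value = all(sum(V(i, j, k) for k in I) == 1 for i in I for j in I)
    # 2. Every row contains every value.
    rows = all(any(V(i, j, k) for j in I) for i in I for k in I)
    # 3. Every column contains every value.
    cols = all(any(V(i, j, k) for i in I) for j in I for k in I)
    # 4. Every block contains every value.
    blocks = all(any(V(i, j, k) and block(i, j) == b for i in I for j in I)
                 for b in I for k in I)
    return one_value and rows and cols and blocks

# A pattern grid that satisfies all four properties (an illustrative assumption).
sample = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
print(is_valid_solution(sample))
```

A solver would additionally fix the given propositions V(i, j, k) from the puzzle and search for an assignment of the remaining squares that passes this check.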


Exercise 4.13 (Solution on page 426)

Solve  the  Sudoku  puzzle  in  Figure  4.1. 
1. Let V(x) stand for the predicate “x visits his parents every weekend”,
where  the  domain  of  discourse  is  the  set  of  students  in  your  class. 
Express  each  of  the  following  quantifications  in  English: 


(a) ∃x V(x)

(b) ∀x V(x)

(c) ∃x ¬V(x)

(d) ∀x ¬V(x)

2.  Using  the  predicates: 

B(x) = “x is a bee”

F(x) = “x is a flower”

L(x, y) = “x likes y”

write  each  of  the  following  statements  in  predicate  logic. 

(a)  All  bees  like  some  flowers. 

(b)  No  bee  likes  only  flowers. 

(c)  No  bee  hates  (that  is,  does  not  like)  all  flowers. 

3.  Express  the  negation  of  each  of  the  statements  in  the  previous  question, 
both  in  English  as  well  as  in  predicate  logic. 

4. Let T(s, c) stand for the predicate “student s takes course c.”
Express the following statements in predicate logic.

(a) “Alice and Bob take all the same courses.”

(b) “Alice and Bob do not take any courses together.”

5. Express the following properties in predicate logic, using only the usual
operations of addition and multiplication as well as the less than
relation < between numbers.

(a) x is a divisor of y.

(b) x and y have no common divisors.

(c) x is a prime number.

(d) Every integer greater than one has a unique smallest prime
divisor.

(e)  (Goldbach’s  Conjecture)  Every  even  integer  greater  than  two 
can  be  written  as  the  sum  of  two  primes. 

6. Express in English what each of the following propositions is saying
about the set of real numbers ℝ, and determine whether they are true
or false.

(a) ∀x ∃y (x + y = x).

(b) ∃y ∀x (x + y = x).

(c) ∀x ∃y (x² = y).

(d) ∀y ∃x (x² = y).

(e) ∀x ∀y (x < y ∨ y < x).
7. Express the following in predicate logic.

(a) At least three items have property P.

(b) At most three items have property P.

(c) Exactly three items have property P.

8.  A  particular  jazz  standard  recorded  by  Doris  Day  has  the  following 
title  and  lyrics: 

Everybody  loves  my  baby 

But  my  baby  don’t  love  nobody  but  me 

Express  the  above  in  predicate  logic.  What  can  you  deduce  from  these 
two  statements  about  who  “my  baby”  is? 

9. Samuel Goldwyn, on being told by a friend that he had
named his son Sam, exclaimed, “Why did you do that? Every Tom,
Dick  and  Harry  is  named  Sam!”  Assuming  Goldwyn  was  right,  and 
assuming  he  was  restricting  his  attention  to  first  names,  how  many 
Sams,  Dicks  and  Harrys  are  there?  Formulate  your  answer  in  predicate 
logic,  including  the  assertion  that  every  person  has  exactly  one  first 
name. 

10.  Lewis  Carroll  made  the  following  argument. 

Everybody  who  is  sane  can  do  logic. 

No  lunatics  are  fit  to  serve  on  a  jury. 

None  of  your  sons  can  do  logic. 

Therefore,  none  of  your  sons  is  fit  to  serve  on  a  jury. 

Formulate  the  four  claims  in  predicate  logic.  Do  you  consider  this  a 
valid  argument? 

11.  Lewis  Carroll  also  made  the  following  three  claims. 

No  professor  is  ignorant. 

All  ignorant  people  are  vain. 

No  professor  is  vain. 

Formulate  these  three  claims  in  predicate  logic.  Do  any  of  them  follow 
from  the  other  two? 

12.  What  is  wrong  with  the  following  argument: 

A  ham  sandwich  is  better  than  nothing. 

Nothing  is  better  than  eternal  happiness. 

Therefore,  a  ham  sandwich  is  better  than  eternal  happiness. 


Chapter  5 

Proof Strategies


You  want  proof?  I’ll  give  you  proof! 

-  Sidney  Harris. 

So  far,  we  have  concentrated  on  developing  formal  languages  for  rigorously 
and  unambiguously  expressing  properties  of  systems,  namely  the  languages 
of  propositional  logic,  predicate  logic,  sets  and  Boolean  algebras.  In  the  case 
of  propositional  logic  we  have  used  truth  tables  to  determine  the  validity 
of  logical  arguments.  We  have  also  learned  what  it  means  for  statements 
of  predicate  logic  to  be  true  or  false,  but  we  have  not  yet  seen  a  procedure 
for determining truth or falsity. This is perhaps not too surprising, as
predicates can range over infinite universes of discourse, hence infinitely many
candidates  potentially  need  to  be  inspected  to  test  statements  such  as  “This 
program  will  terminate  (with  the  correct  result)  at  some  point  in  time.” 

A  proof  of  a  (true)  statement  is  a  demonstration  of  its  validity  which 
contains  sufficient  detail  to  convince  someone  that  the  statement  is  true. 
Statements  which  are  provable  are  called  theorems.  We  encountered  formal 
proofs  already  in  Chapter  3,  where  we  derived  the  truth  of  various  theorems 
of  Boolean  algebra;  each  such  derivation  ended  with  the  symbol  □  indicating 
that  the  truth  of  the  theorem  had  been  established. 

Proofs  allow  us  to  reason  formally  about  properties  of  systems,  so  that 
(ultimately) we can provide convincing and irrefutable evidence of their
correctness. We have already explored some basic proof techniques, for instance
reasoning with logical equivalences in propositional logic and reasoning
equationally with Boolean algebras. However, thus far we have asked no more of
our  reader  than  to  use  common  sense  to  follow  our  reasoning. 

Proofs  of  theorems  often  require  creativity  and  inspiration.  Furthermore, 
there  will  always  be  many  different  ways  to  prove  a  given  theorem,  and  any 
valid  proof  of  a  given  theorem  will  be  just  as  correct  a  proof  as  any  other. 
However,  some  proofs  will  be  more  elegant  and  more  easily  grasped  than 
others. The mathematician Paul Erdős often referred to “The Book” in
which God keeps the most elegant proof of each mathematical theorem, and
noted that “You don’t have to believe in God, but you should believe in
The Book.”

Elegance  aside,  all  formal  proofs  follow  certain  patterns  that  can  be 
learned  like  the  rules  of  chess.  Different  proof  strategies  can  be  applied, 
depending  only  on  the  form  of  the  property  being  considered.  Once  these 
strategies  are  learned,  proofs  can  be  easily  -  even  mechanically  -  constructed 
and  checked.  In  this  chapter,  we  develop  such  proof  strategies  that  will  allow 
us  to  verify  (or  indeed  falsify)  system  properties  from  a  systematic  point  of 
view, relieving us of the need for too much Eureka!-invoking inspiration.


5.1 A First Example

It is obvious - by drawing a Venn diagram - that the union A ∪ B of two
sets A and B contains both A and B as subsets:


A ⊆ A ∪ B and B ⊆ A ∪ B.


But A ∪ B is a very special superset of A and B: it consists of precisely the
elements of A and the elements of B - no more and no less - and is therefore
the least superset of both A and B. In other words, any set C which is a
superset of both A and of B is also a superset of A ∪ B:

A ⊆ C ∧ B ⊆ C ⇒ A ∪ B ⊆ C.

Although  this  fact  should  be  intuitively  clear,  its  validity  deserves  a 
formal  proof  such  as  the  following. 

Theorem 5.1

Let A, B and C be sets. Then

A ⊆ C ∧ B ⊆ C ⇒ A ∪ B ⊆ C.


Proof: Assume that A ⊆ C and B ⊆ C; we must show that A ∪ B ⊆ C.
To do this, we expand the definition of the set inclusion A ∪ B ⊆ C:

every element of A ∪ B must also be in C.

So we pick an arbitrary element x ∈ A ∪ B and we show that x ∈ C. Noting
that x ∈ A ∪ B is the same as x ∈ A ∨ x ∈ B, we proceed by case analysis
on whether x ∈ A or x ∈ B.

1. If x ∈ A, then x ∈ C, since we assumed that A ⊆ C.

2. If x ∈ B, then again x ∈ C, since we assumed that B ⊆ C.

In each case, x ∈ C follows from the assumptions. □

At first sight you might find this proof perhaps more difficult to
understand and less revealing than, for instance, a Venn diagram. However,
in  a  few  steps,  it  can  be  completely  reduced  to  some  basic  proof  strategies 
for  predicate  logic  and  some  basic  principles  about  sets.  Everyone  who  has 
learned  these  strategies  and  principles  can  then  easily  check  this  proof,  and, 
in  fact,  even  machines  can  do  that  for  you. 
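
As a modest illustration of mechanical checking, the statement of Theorem 5.1 can be tested exhaustively over every choice of A, B and C drawn from a small finite universe. This Python sketch is a finite sanity check under that assumption, not a replacement for the proof, which covers arbitrary sets; the function names are illustrative.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def theorem_5_1_holds(universe):
    """Check that A <= C and B <= C imply A | B <= C for all subsets A, B, C."""
    S = subsets(universe)
    return all((not (A <= C and B <= C)) or (A | B) <= C
               for A in S for B in S for C in S)

print(theorem_5_1_holds({1, 2, 3}))
```

Here <= is Python's subset test and | is set union, so the body of all(...) is a direct transcription of the implication in the theorem.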

So  let  us  take  a  quick  initial  look  at  some  of  the  proof  strategies  that  occur 
in  this  argument.  They  are  based  on  logical  principles  of  reasoning  with 
propositional  connectives  and  quantifiers.  They  deal  with  these  connectives 
and  quantifiers  in  two  essentially  different  ways. 

First, in order to prove the implication

A ⊆ C ∧ B ⊆ C ⇒ A ∪ B ⊆ C,

we have assumed that A ⊆ C ∧ B ⊆ C and proved that A ∪ B ⊆ C from this
assumption. The underlying proof strategy allows us to prove an arbitrary
implication P ⇒ Q by assuming P and proving Q from this assumption. In
a similar fashion, instead of proving

∀x (x ∈ A ∪ B ⇒ x ∈ C),

we have proved x ∈ A ∪ B ⇒ x ∈ C for an arbitrary x taken from the
universe of discourse.

One  way  of  understanding  these  proof  strategies  is  that  they  decompose 
a  proof  goal,  replacing  it  with  a  simpler  one  from  which  the  original  goal 
follows more or less automatically. These strategies narrow the distance
between the assumptions and the goal from the goal side, hence in a bottom-up
way. Another way of understanding these strategies is to observe that
they introduce a logical connective or quantifier into a proof. They can
therefore be characterised as introduction strategies for connectives or
quantifiers. The strategies mentioned above, for instance, introduce
implication and universal quantification, respectively.

A second kind of strategy allows us to use complex assumptions or
intermediate proof results (which can also be seen as assumptions) in proofs.
In the above proof, for instance, we have used a strategy that allowed us
to decompose the assumption A ⊆ C ∧ B ⊆ C into two separate
assumptions A ⊆ C and B ⊆ C. Also, to prove x ∈ C from the assumption
x ∈ A ∨ x ∈ B, we have used a case analysis strategy and proved x ∈ C
first from x ∈ A and then from x ∈ B. The underlying proof strategy allows
us to prove a goal R from a disjunction P ∨ Q by case analysis, that is, by
proving R from the assumption P and from the assumption Q separately.
This second type of strategy can be understood as narrowing the distance
between the assumptions and the goal from the assumptions side, hence in
a top-down way. They eliminate logical connectives or quantifiers and can
therefore be characterised as elimination strategies.

When  faced  with  the  prospect  of  proving  a  theorem,  a  sensible  approach 
would  be  to: 

1.  write  out  any  assumptions,  and  previously-established  facts  that  you 
suspect  may  be  relevant,  at  the  top  of  a  page; 

2.  write  out  the  statement  which  you  wish  to  prove  at  the  bottom  of  the 
page; 

3.  repeatedly  apply  elimination  strategies  to  the  statements  at  the  top, 
and  introduction  strategies  to  the  statements  at  the  bottom,  and  look 
for  how  to  make  the  logical  argument  meet  in  the  middle. 

With this in mind, we will present basic introduction and elimination
strategies for each of the propositional connectives and quantifiers, and depict
these  as  proof  outlines  with  “holes”  in  the  middle  that  need  to  be  filled  in. 
The  justification  behind  each  such  proof  outline  will  be  made  evident. 


Exercise 5.2 (Solution on page 427)

Let A, B and C be sets. Prove the converse of Theorem 5.1, that

A ∪ B ⊆ C ⇒ A ⊆ C ∧ B ⊆ C.


5.2 Proof Strategies for Implication

A  proof  of  a  theorem  consists  of  a  sequence  of  statements,  each  either  being 
assumed  or  known  to  be  true,  or  logically  inferred  from  (i.e.,  implied  by) 
earlier  statements  appearing  in  the  proof.  It  is  sensible,  therefore,  to  start 
by  considering  proof  strategies  for  implication. 

In  our  introductory  example  we  proved  the  theorem 

A ⊆ C ∧ B ⊆ C ⇒ A ∪ B ⊆ C

by assuming that A ⊆ C ∧ B ⊆ C and showing from this that A ∪ B ⊆ C.
This idea can be generalised to the following proof strategy for implication.

Assume P.
Then

proof of Q

Therefore, P ⇒ Q. □

This is an introduction strategy for implication, as it gives a method for
introducing a statement of the form P ⇒ Q into a proof.


Consider  the  fact  that  the  average  of  two  different  numbers  lies  somewhere 
strictly  between  the  two.  For  example,  the  average  of  the  two  numbers 
13  and  25  is  19,  which  lies  strictly  between  the  two  given  numbers  13  and 
25.  This  general  fact  is  intuitively  obvious.  However,  once  it  is  rendered 
in  precise  mathematical  terms,  it  becomes  something  that  is  nonetheless 
deserving  of  a  proof. 

As  a  mathematical  statement,  the  above  fact  becomes: 

If a < b then a < (a+b)/2 and (a+b)/2 < b.

More precisely, this statement is of the form
P ⇒ Q
where

P = a < b, and

Q = a < (a+b)/2 < b.

Here we prove one half of this result:

If a < b then (a+b)/2 < b.

Proof: Assume that a < b.

Then, by adding b to both sides, we get that a+b < b+b.

Thus, by dividing both sides by 2, we get that (a+b)/2 < (b+b)/2.

Since (b+b)/2 = b, we get that (a+b)/2 < b.

Therefore, if a < b then (a+b)/2 < b. □

The  introduction  strategy  for  implication  is  so  fundamental  that  one 
usually  just  assumes  P  and  proves  Q  without  even  mentioning  that  this 
yields  a  proof  of  P  =>  Q. 
Prove that the product of two even integers is an even integer.

Proof: Assume that a and b are even integers.

An even integer is twice an integer.

Thus a = 2p and b = 2q for some integers p and q.

Hence ab = (2p)(2q) = 4pq = 2k for the integer k = 2pq.

Therefore,  ab  is  an  even  integer.  □ 


Exercise 5.3 (Solution on page 427)


Prove  that  the  product  of  two  odd  integers  is  an  odd  integer. 


There  is  another  introduction  strategy  for  implication  which  may  be 
more  natural  to  apply  on  occasion.  We  can  assume  that  Q  is  false  and  prove 
that, under this assumption, P must also be false. The form of such a proof
would thus be as follows.

Assume ¬Q.
Then

proof of ¬P

Therefore, P ⇒ Q. □

A proof which employs this strategy is referred to as a proof by
contraposition.


Prove, by contraposition, the result from Example 5.2 that, for any two real
numbers a and b, if a < b then (a+b)/2 < b.


Proof: Suppose that (a+b)/2 ≥ b.

Then, by multiplying both sides by 2, we get that a + b ≥ 2b.
Thus, by subtracting b from both sides, we get that a ≥ b.

Therefore, if a < b then (a+b)/2 < b. □


Corresponding  to  the  above  two  introduction  strategies,  there  are  two 
elimination  strategies  for  implication  which  allow  us  to  draw  inferences  from 
statements in a proof that involve implication. These strategies are as
follows.

1. If P ⇒ Q is true and P is true, then Q is true.

A use of this proof strategy is referred to as modus ponens, and takes
the following form.

proof of P ⇒ Q

proof of P

Therefore, Q. □

2. If P ⇒ Q is true and ¬Q is true, then ¬P must be true.

A use of this proof strategy is referred to as modus tollens, and takes
the following form.

proof of P ⇒ Q

proof of ¬Q

Therefore, ¬P. □

Indeed, we have already seen these proof principles in action, particularly
extensively in the solution to the Amos Judd puzzle of Exercise 1.14 (page 33).
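
The soundness of the implication strategies can be confirmed with the truth-table method of the earlier chapters. The Python sketch below (helper names are illustrative assumptions) checks that modus ponens and modus tollens are tautologies and that P ⇒ Q is equivalent to its contrapositive; for contrast, it also shows that the superficially similar pattern of inferring P from P ⇒ Q and Q is not a tautology.

```python
from itertools import product

def implies(p, q):
    """Material implication p => q."""
    return (not p) or q

def is_tautology(formula, arity=2):
    """True if the formula holds under every truth assignment."""
    return all(formula(*vals) for vals in product([False, True], repeat=arity))

modus_ponens = lambda p, q: implies(implies(p, q) and p, q)
modus_tollens = lambda p, q: implies(implies(p, q) and not q, not p)
contraposition = lambda p, q: implies(p, q) == implies(not q, not p)
affirming_consequent = lambda p, q: implies(implies(p, q) and q, p)  # a fallacy

print(is_tautology(modus_ponens))          # True
print(is_tautology(modus_tollens))         # True
print(is_tautology(contraposition))        # True
print(is_tautology(affirming_consequent))  # False: fails when P is false, Q true
```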
Prove that if a ∈ A and A ⊆ B then a ∈ B.

Proof: Assume that a ∈ A, and that A ⊆ B.

By definition, A ⊆ B means that x ∈ A ⇒ x ∈ B for any x.

In particular, a ∈ A ⇒ a ∈ B.

Thus, by modus ponens, a ∈ B. □


Exercise 5.5 (Solution on page 427)


Prove  that  the  number  9  839  853  is  divisible  by  3.  (You  may  use  the  fact 
that  a  number  is  divisible  by  3  if  the  sum  of  its  digits  is  divisible  by  3.) 


5.3 Proof Strategies for Negation

The main approach to proving a property of the form ¬P is to assume
that P is true and to infer from this a contradiction. By this, we mean
that both some property Q and its negation ¬Q can be inferred from our
assumption P; as such a contradiction is impossible, the assumption from
which it was inferred must be invalid. The form of such a proof would thus
be as follows.

Assume P.
Then

proof of contradiction

Therefore, ¬P. □

This  is  the  standard  negation  introduction  strategy.  The  associated  negation 
elimination  strategy  is  nearly  identical,  allowing  positive  results  to  be  proven 
by  contradiction.  It  takes  the  following  form. 
Assume ¬P.
Then

proof of contradiction

Therefore, P. □


A  proof  which  employs  either  of  these  strategies  is  referred  to  as  a  proof  by 
contradiction  or,  more  fancily,  as  reductio  ad  absurdum. 

Our  first  example  of  a  proof  by  contradiction  is  over  2000  years  old  and 
is  attributed  to  the  school  of  Pythagoras. 


Example 


Prove that √2 is irrational; that is, √2 ∉ ℚ.


Proof. Suppose to the contrary that √2 ∈ ℚ; specifically, suppose that
√2 = a/b where a and b are positive integers and a/b is a fraction in
lowest form; in particular, a and b are not both even.

Then squaring both sides gives us that 2 = a²/b², and then multiplying
both sides by b² gives us that 2b² = a².

Hence a must be even (since, by Exercise 5.3, if a were odd then a²
would also be odd); that is, a = 2c for some integer c.

As a and b are not both even, b must be odd.

But then 2b² = a² = (2c)² = 4c², so b² = 2c², which means that b must
be even, contradicting our earlier observation that b must be odd.

This means that our assumption that √2 is rational must be
invalid; that is, √2 must in fact be irrational. □
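
A computer search cannot establish irrationality, since it can only examine finitely many fractions, but it does illustrate what the proof rules out. The Python sketch below looks for positive integers a and b with a² = 2b² for every denominator b up to a chosen bound; the function name and the bound are illustrative assumptions.

```python
from math import isqrt

def rational_sqrt2_candidates(max_b):
    """Return all pairs (a, b) with 1 <= b <= max_b and a*a == 2*b*b."""
    hits = []
    for b in range(1, max_b + 1):
        a = isqrt(2 * b * b)  # the only possible integer candidate for a
        if a * a == 2 * b * b:
            hits.append((a, b))
    return hits

print(rational_sqrt2_candidates(10_000))  # [] : no such fraction exists
```

The empty result for any bound is exactly what the proof by contradiction guarantees; the proof is what extends this observation to all denominators at once.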


Another  famous  example  of  a  proof  by  contradiction  that  is  also  over 
2000  years  old,  this  time  due  to  Euclid,  is  the  following  argument  that  there 
are  infinitely  many  prime  numbers.  The  proof  relies  on  the  Fundamental 
Theorem  of  Arithmetic  -  also  proved  by  Euclid  and  which  we  prove  in 
Exercise  9.9,  page  235  -  which  states  that  every  positive  integer  can  be 
expressed  as  a  product  of  prime  numbers;  in  particular,  every  such  number 
is  divisible  by  some  prime  number. 




Prove  that  there  are  infinitely  many  prime  numbers. 

Proof. Suppose to the contrary that there are finitely many prime numbers,
which we may list as {p₁, p₂, p₃, …, pₖ}.


Let n = (p₁ × p₂ × p₃ × ⋯ × pₖ) + 1.

This number cannot be prime, as it is clearly larger than every
one of the k prime numbers p₁ through pₖ.

Thus, by the Fundamental Theorem of Arithmetic, some prime
number pᵢ must divide evenly into n.

However this is impossible, as dividing n by pᵢ clearly leaves a
remainder of 1, and hence pᵢ does not divide evenly into n.

Therefore,  our  assumption  that  there  are  finitely  many  prime  numbers 
must  be  invalid;  that  is,  there  must  in  fact  be  infinitely  many  prime 
numbers.  □ 
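
Euclid's construction can be tried out on a concrete list of primes. The Python sketch below, which uses an illustrative trial-division helper and takes the first six primes as its assumed list, forms n as their product plus one and finds the smallest prime dividing n. Outside the hypothetical world of the proof, n need not itself be prime, but its prime factors always lie outside the list.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n (for n >= 2), found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

primes = [2, 3, 5, 7, 11, 13]
n = 1
for p in primes:
    n *= p
n += 1                          # n = 2*3*5*7*11*13 + 1 = 30031
q = smallest_prime_factor(n)

print(n, q)                     # 30031 59
print([n % p for p in primes])  # every remainder is 1
print(q in primes)              # False: 59 is a prime missing from the list
```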


Suppose that A ∩ C ⊆ B and that a ∈ C. Prove that a ∉ A \ B.


As  always,  before  blindly  starting  a  proof,  you  should  try  to  get  a  good 
impression  in  your  mind  as  to  what  it  is  you  are  trying  to  prove.  If  possible, 
this  is  best  done  by  drawing  a  picture,  which  in  this  case  means  a  Venn 
diagram: 


Here  we  have  depicted  three  sets  A,  B  and  C  which  satisfy  the  premise  of 
the proposition that we wish to prove: that A ∩ C ⊆ B. From this we need
to infer that any element a ∈ C (i.e., which lies in the light gray area) will
not  be  in  A  \  B  (i.e.,  cannot  lie  in  the  dark  gray  area).  This  seems  obvious 
in  the  picture,  but  a  rigorous  argument  is  still  demanded.  Fortunately,  now 
that  we  have  a  clear  picture  in  our  mind,  a  rigorous  proof  seems  trivial. 

Proof. Assume that the premises of the proposition are true, that A ∩ C ⊆ B
and that a ∈ C. We shall show that assuming that a ∈ A \ B leads to
a contradiction.

Suppose that a ∈ A \ B; that is, that a ∈ A but that a ∉ B.

Since a ∈ A and a ∈ C (from the premise of the proposition), we
have that a ∈ A ∩ C.

But since A ∩ C ⊆ B (again from the premise of the proposition),
from a ∈ A ∩ C we get that a ∈ B, contradicting a ∉ B.

Therefore, we cannot have a ∈ A \ B; that is, we have a ∉ A \ B. □


As usual, there are various ways that this proposition can be proven, all of
which are equally valid. The following is provided as an example.

A Different Proof. Assume that the premises of the proposition are true,
that A ∩ C ⊆ B and that a ∈ C.

As a ∈ A \ B if, and only if, a ∈ A and a ∉ B, we shall show that
a ∉ A \ B by showing that we cannot have both a ∈ A and a ∉ B;
that is, if we assume that a ∈ A then we can deduce that a ∈ B.

Suppose then that a ∈ A.

Since a ∈ C (from the premise of the proposition), we have that
a ∈ A ∩ C.

But then since A ∩ C ⊆ B (again from the premise of the
proposition), we have that a ∈ B.

Therefore, we cannot have both a ∈ A and a ∉ B; that is to say, we
must have a ∉ A \ B. □


Example  5.9 


Assume  that  a  and  b  are  positive  real  numbers. 

Prove that either a ≤ √(ab) or b ≤ √(ab).

Proof. Suppose to the contrary that a > √(ab) and b > √(ab).

Then ab > (√(ab))² = ab, which is impossible.

Therefore, either a ≤ √(ab) or b ≤ √(ab). □


Exercise 5.9 (Solution on page 427)


Prove  that  there  is  no  such  thing  as  the  smallest  positive  rational  number. 
Exercise 5.10 (Solution on page 428)

Prove  that  every  integer  greater  than  1  can  be  written  as  a  product  of  prime 
numbers. 

(Note  that  a  prime  number  is  the  trivial  product  of  one  prime  number.) 
There is very little that is interesting or necessary to say about dealing with
conjunctions in proofs. To prove a property of the form P ∧ Q, we simply prove P
and Q separately. The form of such a proof will look as follows.

proof of P

proof of Q

Therefore, P ∧ Q. □


This is the basic introduction strategy for conjunction. The basic
elimination strategy is equally straightforward: we may infer the truth of one of
the conjuncts of an established conjunction. The form of such a proof will
look like one of the following.

proof of P ∧ Q

Therefore, P. □

proof of P ∧ Q

Therefore, Q. □

These will rarely be used in isolation, and their use inevitably comes
naturally. As such, the following examples - while instructive - are somewhat
contrived and superfluous.


Example  5.10 


Prove that if x ∈ A and x ∈ B then x ∈ A ∩ B.

Proof: Assume that x ∈ A and that x ∈ B.

By the conjunction introduction strategy, we can infer from this that
x ∈ A ∧ x ∈ B, which by definition means that x ∈ A ∩ B. □


Prove that if x ∈ A ∩ B then x ∈ A and x ∈ B.


Proof: Assume that x ∈ A ∩ B.

By definition this means that x ∈ A ∧ x ∈ B.

By the conjunction elimination strategy, we can infer from this both
that x ∈ A and that x ∈ B. □

An equivalence P ⇔ Q between properties P and Q simply represents
the fact that each property implies the other; it is true if, and only if, P ⇒ Q
and Q ⇒ P. As such, proof strategies for equivalence are naturally based
on those for conjunction. To prove P ⇔ Q we simply prove P ⇒ Q and
Q ⇒ P separately. The form of such a proof will look as follows.

proof of P ⇒ Q

proof of Q ⇒ P

Therefore, P ⇔ Q. □

This is the basic introduction rule for equivalence. The basic elimination
strategy is to infer the implication in one direction or the other from an
established equivalence. The form of such a proof will look as follows.

proof of P ⇔ Q

Therefore, P ⇒ Q. □

proof of P ⇔ Q

Therefore, Q ⇒ P. □

5.5 Proof Strategies for Disjunction

To prove a disjunctive property P ∨ Q, it suffices to prove one or the
other of the disjuncts. The basic introduction strategy for disjunction is
thus of the following forms.

proof of P

Therefore, P ∨ Q. □

proof of Q

Therefore, P ∨ Q. □


The  above  is  rather  weak,  though.  It  might  not  be  the  case  that  one 
of  P  or  Q  always  holds;  rather,  which  holds  might  depend  on  some  other 
factors.  That  is,  it  might  be  that  P  holds  whenever  some  property  R  holds, 
and  that  Q  holds  when  the  property  R  does  not  hold.  In  this  case  the  proof 
of  P  V  Q  needs  to  be  broken  into  cases.  The  relevant  introduction  strategy 
would  then  be  of  the  following  proof  form. 

Assume R.
Then

proof of P

Assume ¬R.
Then

proof of Q

Therefore, P ∨ Q. □


Prove that for any integer n, the remainder of n² when divided by 4 is either
0 or 1.


Proof: Either n is even or it is odd.

• If n is even, then n = 2k for some integer k, and

n² = (2k)² = 4k²

which clearly has a remainder of 0 when divided by 4.

• If n is odd, then n = 2k + 1 for some integer k, and

n² = (2k + 1)² = 4k² + 4k + 1 = 4(k² + k) + 1

which clearly has a remainder of 1 when divided by 4.

Thus the remainder of n² when divided by 4 is either 0 or 1. □
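
The two cases of this proof can be observed numerically. The following Python sketch (the function name and the sample range are illustrative assumptions) collects the remainders of n² on division by 4 over a range of integers, and separately for even and odd n. It is a finite check only; the case analysis above is what covers every integer.

```python
def square_remainders(modulus, numbers):
    """Set of remainders of n*n on division by modulus, for n in numbers."""
    return {(n * n) % modulus for n in numbers}

sample = range(-100, 101)
evens = [n for n in sample if n % 2 == 0]
odds = [n for n in sample if n % 2 != 0]

print(square_remainders(4, sample))  # {0, 1}
print(square_remainders(4, evens))   # {0} : the case n = 2k
print(square_remainders(4, odds))    # {1} : the case n = 2k + 1
```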


A  special  case  of  the  above  strategy  is  to  take  the  property  R  to  be  P 
itself.  In  this  case,  there  would  be  no  effort  needed  to  infer  P  from  the 
assumption  P,  so  the  form  of  the  proof  would  be  as  follows. 


Assume ¬P.
Then 


proof  of  Q 


Therefore,  P  V  Q.  □ 


Example 5.13

Prove that if x is a real number with x² > x then either x < 0 or x > 1.

Proof: Assume as given that x² > x. Clearly this means that x ≠ 0.

If it is not the case that x < 0, then x > 0, and we can divide each side
of the given inequality x² > x by x to deduce that x > 1.

Hence, either x < 0 or x > 1. □


Exercise 5.13  (Solution on page 428)

Prove that if the product of two integers is even, then one of these two
integers is itself even.

Prove that if A ⊆ B then either x ∉ A or x ∈ B.


The elimination strategy for disjunction is more interesting. If we have
as given a property P ∨ Q, we can prove that a further property R holds
by breaking the proof into cases; that is, we show that P ⇒ R and Q ⇒ R.
This being the case, regardless of which of P or Q is true, R must be true.
The form of this elimination strategy is thus as follows.


Prove that A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ C.

Proof. Let x ∈ A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

Then either x ∈ A ∩ B, in which case x ∈ (A ∩ B) ∪ C;
or x ∈ A ∩ C, in which case x ∈ C so again x ∈ (A ∩ B) ∪ C. □


Example 5.15

Prove that if |x − 3| > 3 then x² > 6x.

Proof. If |x − 3| > 3 then either x > 6, in which case x² > 6x;
or x < 0, in which case x² > 0 > 6x. □


Exercise 5.15  (Solution on page 429)

Prove the triangle inequality: For real numbers a and b, |a + b| ≤ |a| + |b|.

Exercise 5.16  (Solution on page 429)

Prove that if n is an integer, then the final (units) digit of n² must be either
0, 1, 4, 5, 6 or 9; that is, n² cannot end with a 2, 3, 7 or 8.

Exercise 5.17  (Solution on page 430)

What is wrong with the following proof?

Fact: If x + y = 12 then x ≠ 7 and y ≠ 8.

Proof: Assume that the conclusion is false, that is, that it is not the
case that x ≠ 7 and y ≠ 8.

Then x = 7 and y = 8, so x + y = 15 ≠ 12.

Hence if x + y = 12 then x ≠ 7 and y ≠ 8. □

5.6.1  Universal Quantification

A universal quantification ∀x P(x) represents a potentially-infinite conjunction,
asserting that P(a) is true for every value a of the universe of discourse
for the predicate P. As such, we look at how to generalise the proof strategies
for conjunction.

To prove a property of the form ∀x P(x), let a stand for an arbitrary
object, and prove P(a). The form of such a proof would thus be as follows.

Let a be arbitrary.

    proof of P(a)

Therefore, ∀x P(x). □

As long as we make no assumptions about a in the proof of P(a), then this
proof will be valid for whatever choice of a we make. That is, we will have
shown that P(x) must be true for every x (that is, for any and every choice
of value a for x). It should be apparent how this introduction strategy
generalises that for conjunction.

Note  that  we  have  already  been  tacitly  using  this  strategy.  For  instance, 
in  Example  5.9  we  proved  a  result  held  for  all  positive  real  numbers  a  and  b, 
by  assuming  as  given  arbitrary  values  for  a  and  b.  Usually  this  is  fine  -  we 
generally  don’t  have  to  think  twice  about  taking  arbitrary  values  as  given. 
However,  we  do  sometimes  have  to  be  more  careful  with  introducing  values. 

We can next look to the elimination strategy for conjunction to derive
a straightforward generalisation which tells us how to use a universal
quantification within a proof. If we have ascertained that ∀x P(x) is true and a
is an element in the universe of discourse for the predicate P, then we can
immediately infer that P(a) is true. The form of such a proof will look as
follows.

    proof of ∀x P(x)

Therefore, P(a).


Example 5.17

Prove that if A ∩ B = A then A ⊆ B.

Proof. Assume that A ∩ B = A. We need to demonstrate that A ⊆ B, that
is, that for any x, if x ∈ A then x ∈ B:

    ∀x (x ∈ A ⇒ x ∈ B).

To this end, let a be an arbitrary value.

To show that a ∈ A ⇒ a ∈ B, we assume that a ∈ A and prove
from this assumption that a ∈ B.

Assume then that a ∈ A.

Since A ∩ B = A (from the premise of the proposition), this means
that a ∈ A ∩ B.

But this means that a ∈ A and a ∈ B; in particular, that a ∈ B.

Therefore, ∀x (x ∈ A ⇒ x ∈ B); that is, A ⊆ B. □


Example 5.18

Prove that ∀x (P(x) ∧ Q(x)) ⇔ ∀x P(x) ∧ ∀x Q(x).

Proof. (⇒) Suppose ∀x (P(x) ∧ Q(x)), and let a be an arbitrary value.

Then P(a) ∧ Q(a), so P(a) and Q(a).

Since a is arbitrary, we can infer that ∀x P(x) and ∀x Q(x);
that is, ∀x P(x) ∧ ∀x Q(x).

(⇐) Suppose ∀x P(x) ∧ ∀x Q(x), and let a be an arbitrary value.

Then P(a) and Q(a), so P(a) ∧ Q(a).

Since a is arbitrary, we can infer that ∀x (P(x) ∧ Q(x)). □


Exercise 5.18  (Solution on page 430)

Prove that if A and B \ C are disjoint then A ∩ B ⊆ C.


5.6.2  Existential  Quantification 

An existential quantification ∃x P(x) represents a potentially-infinite disjunction,
asserting that P(a) is true for some value a of the universe of
discourse for the predicate P. As such, we look at how to generalise the
proof strategies for disjunction.

To prove a property of the form ∃x P(x), we need only find a value a for
which P(a) holds, and prove P(a). The form of such a proof would thus be
as follows.

Note the difference between this introduction strategy and the introduction
strategy for ∀x P(x). To prove ∀x P(x) you need to prove that P(a) holds
for an arbitrary value a without making any assumptions about a. To prove
∃x P(x) you need to prove that P(a) holds for a single chosen value of a.

We next look to the elimination strategy for disjunction to derive a generalisation
which tells us how to use an existential quantification within a
proof. If we have ascertained that ∃x P(x) is true, and if some property R
holds under the assumption that P(a) holds regardless of the specific value
a of the universe of discourse, then we can infer that R is true. The form of
such a proof will look as follows.


Prove that, if x ≠ 1, then (y − 2)/(y + 1) = x for some y.

Proof. Let y = (x + 2)/(1 − x) (noting that, since x ≠ 1, 1 − x ≠ 0, and so we are
not inadvertently dividing by 0 in defining y).

Then

    y − 2 = (x + 2 − 2(1 − x))/(1 − x) = 3x/(1 − x)

and

    y + 1 = (x + 2 + (1 − x))/(1 − x) = 3/(1 − x),

so (y − 2)/(y + 1) = x. □


The difficulty with proving the existence of an object a for which a
property P holds is: how do we find the particular value a? In the above
example, why did we choose to take y = (x + 2)/(1 − x)? The answer in this
case - as it typically will be - lies in working backwards. Since we wanted
to find a value y such that (y − 2)/(y + 1) = x, we worked from this equation:

• by multiplying both sides by (y + 1) we get y − 2 = x(y + 1) = xy + x;

• by rearranging terms to get all (and only) terms involving y on one side
  (i.e., by adding 2 − xy to both sides) we get x + 2 = y − xy = y(1 − x);

• dividing each side by (1 − x) - noting that this will not be an illegal
  division by zero, since the premise stipulates that x ≠ 1 - we arrive at
  the value we seek: y = (x + 2)/(1 − x).
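
The witness found by working backwards can be checked with exact rational arithmetic. The sketch below is illustrative only: the function name `witness` and the sample values of x are assumptions, and Python's `fractions.Fraction` is used so that no rounding error enters the comparison.

```python
from fractions import Fraction


def witness(x: Fraction) -> Fraction:
    """Return y = (x + 2) / (1 - x), the value found by working backwards."""
    if x == 1:
        raise ValueError("x must differ from 1")
    return (x + 2) / (1 - x)


# For each sample x != 1, the chosen y satisfies (y - 2) / (y + 1) = x.
for x in [Fraction(0), Fraction(-3), Fraction(5, 2), Fraction(7)]:
    y = witness(x)
    assert (y - 2) / (y + 1) == x
```

Checking a handful of values does not replace the algebraic proof; it only confirms that the formula for y was derived without a slip.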


Exercise 5.19  (Solution on page 430)

Prove that for every real x > 0 there is a real y such that y(y + 1) = x.


Although typically the case, it isn’t strictly necessary (nor sometimes
even possible) to explicitly find the specific value x which witnesses the fact
that ∃x P(x); the mere fact that such a value exists is all that needs to be
demonstrated.

Example 5.20  A Strange Proof of Existence

Fact: There are irrational numbers a and b such that a^b is rational.

Proof. We know from Example 5.6 that √2 is irrational.

Furthermore, either (√2)^√2 is rational or it is irrational.

• Suppose (√2)^√2 is rational. Let a = b = √2.

  Then a and b are irrational, and a^b = (√2)^√2 is rational.

• Suppose (√2)^√2 is irrational. Let a = (√2)^√2 and b = √2.

  Then a and b are irrational, and

      a^b = ((√2)^√2)^√2 = (√2)^(√2·√2) = (√2)² = 2

  is rational. □

What is strange about this example is that we demonstrated the existence
of two particular irrational numbers a and b which satisfy our conditions
without discovering for certain what these particular numbers are!


Exercise 5.20  (Solution on page 431)

Prove that ∃x (P(x) ∨ Q(x)) ⇔ ∃x P(x) ∨ ∃x Q(x).

(Hint: refer to the proof in Example 5.18.)


5.6.3  Uniqueness 

There are two approaches to proving a property of the form ∃!x P(x), the
first by proving existence and uniqueness separately, and the second by
combining these two concerns.

1. First prove existence: ∃x P(x)

   and then uniqueness: ∀y ∀z [(P(y) ∧ P(z)) ⇒ y = z].

2. Prove ∃x [P(x) ∧ ∀y (P(y) ⇒ y = x)].

Either  way,  the  proof  strategies  are  derived  from  existing  strategies. 


Example 5.21

Prove that for every x there is a unique y such that x²y = x − y.

Proof. Let y = x/(x² + 1).

Then x²y = x³/(x² + 1) = (x(x² + 1) − x)/(x² + 1) = x − y.

Furthermore, if x²z = x − z, then z(x² + 1) = x,
so z = x/(x² + 1) = y. □


Example 5.22

Suppose ℱ is a family of sets. Prove that there is a unique set A that has
the following two properties:

1. ℱ ⊆ 𝒫(A).

2. ∀B (ℱ ⊆ 𝒫(B) ⇒ A ⊆ B).

Proof. Let A = ⋃ℱ.

1. Suppose X ∈ ℱ.

   Then X ⊆ ⋃ℱ; that is, X ⊆ A.

   Hence X ∈ 𝒫(A).

2. Suppose B is any set satisfying ℱ ⊆ 𝒫(B).

   Let a ∈ A; that is, a ∈ ⋃ℱ.

   Then ∃X ∈ ℱ with a ∈ X.

   Thus X ∈ 𝒫(B), so X ⊆ B.

   Hence a ∈ B. □


Exercise 5.22  (Solution on page 431)


Prove  that  there  is  a  unique  set  A  such  that,  for  every  set  B,  A  U  B  =  B. 


Additional Exercises

1.  Prove  that,  for  any  two  real  numbers  a,  b  £  R:  if  a  <  b  then  a  <  a  ^ 

2.  Assume  that  m  and  n  are  integers.  Prove  that  if  m+n  is  even,  then 
m  and  n  are  either  both  even  or  both  odd. 

3.  Assume  that  n  is  an  integer.  Prove  that  if  3n  +  2  is  an  odd  integer, 
then  n  must  be  an  odd  integer. 

4.  Prove  that  there  is  no  even  prime  number  greater  than  2. 

5. Prove that √3 is irrational.

6.  Prove  or  disprove  each  of  the  following. 

(a)  The  sum  of  two  rational  numbers  is  rational. 

(b)  The  sum  of  two  irrational  numbers  is  irrational. 

7. Assume that n is an integer. Prove that n² ≥ n.

8. Prove that if n is an integer, then the final digit of n⁴ must be either
0 or 1 or 5 or 6.

9.  Prove  that  there  are  no  integer  solutions  to  the  equation  x2  +  2y2  =  24. 

10.  Prove  the  Distributivity  Laws  for  sets: 

(a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

11. Prove the following.

(a) P ⇒ (Q ⇒ P).

(b) (P ⇒ Q) ⇒ (P ∨ R ⇒ Q ∨ R).

(c) (P ⇒ Q) ⇒ (P ∧ R ⇒ Q ∧ R).

12. Prove the following.

(a) ∀x (P(x) ∧ Q(x)) ⇔ ∀x P(x) ∧ ∀x Q(x).

(b) ∃x (P(x) ∨ Q(x)) ⇔ ∃x P(x) ∨ ∃x Q(x).

(c) ∀x (P(x) ∨ Q(x)) ⇐ ∀x P(x) ∨ ∀x Q(x).

(d) ∃x (P(x) ∧ Q(x)) ⇒ ∃x P(x) ∧ ∃x Q(x).

13. Prove that the two approaches to proving ∃!x P(x) from Section 5.6.3
are equivalent.


Chapter  6 
Functions 


Home computers are being called upon to perform many new functions,
including the consumption of homework formerly eaten by the
dog.


Doug  Larson. 


We  regularly  want  to  associate  to  each  value  of  one  set  A  some  particular 
value  taken  from  another  set  B  (which  may  be  the  same  set  A).  Such  a 
mapping  of  values  in  A  to  values  in  B  is  referred  to  as  a  function. 

Functions arise everywhere in people’s lives. For example, shoppers are
ever calculating (or at least estimating) for themselves the cost of their basket
of goods from the number and unit costs (plus relevant sales taxes).
Functions are especially relevant to the computer scientist’s world. Computer
programs are written to turn input values into output values, and the
design and implementation of Boolean circuits will inevitably start from a
definition of the function of the circuit which describes its behaviour on each
possible input. For this reason, it is necessary to take a careful look at what
a function is and understand its definition and the various properties that
it may enjoy.


Basic  Definitions 


A function f from a set A to a set B is an assignment of exactly one
element of B to each element of A. We write f : A → B to denote that f
is a function from A to B, and we write f(a) to refer to the unique element
of B assigned to the element a of A by the function f. Thus f maps each
element a of A to an element b = f(a) of B, which we will also denote by
f : a ↦ b. Figure 6.1 gives a pictorial representation of such a function.


Each person in a class of twelve students is assigned a particular grade,
in the form of an integer percentage between 0 and 100, which appears as
follows on a list posted on a bulletin board:

    Andrews    75      Evans      78      Parker     64
    Archer     92      Fletcher   46      Smith      59
    Collins    64      Greene     68      Taylor    100
    Davies     88      Lewis      54      Williams   78

Here, each person in the set

    Class = { Andrews, Archer, Collins, Davies, Evans, Fletcher,
              Greene, Lewis, Parker, Smith, Taylor, Williams }

is assigned a value from the set

    Marks = { 0, 1, 2, 3, 4, ..., 100 }.

This describes a function

    score : Class → Marks

in which, for example, score(Greene) = 68; the function score maps the value
Greene to the value 68, that is, score : Greene ↦ 68.


It is possible for a function f : A → B to assign the same value from B
to two different values of A. In the above example,

    score(Collins) = score(Parker) = 64.

However, only one value of B may be assigned to any value of A. In this
sense, a function f : A → B may be viewed as a machine into which you
input a value x ∈ A and - depending only on that value - some value
f(x) ∈ B will be output in response:

    input x  →  [ f ]  →  f(x) output

If the same value for x is input on two separate occasions on the left, then
the same value will be output on the right for f(x) on both occasions.

If f : A → B is a function from A to B, we refer to A as the domain of
f and B as its codomain. If f(a) = b we refer to a as an argument of the
function f, and to b as the value of the function f on argument a.

If the domain of the function f is the Cartesian product A₁ × A₂ × ⋯ × Aₙ,
then we say that f has arity n or that f takes n arguments. A function
which takes two arguments is called a binary function. Common binary
functions are often written in infix form x f y rather than f(x, y). For
example, we would naturally write 2 + 2 = 4 rather than +(2, 2) = 4.

The range of the function f : A → B, denoted range(f), is the subset of
the codomain B consisting of all values that the function f can produce:

    range(f) = { f(a) : a ∈ A }.

Given a subset S ⊆ A of the domain of f, the image of S under f,
denoted by f(S), is the subset of the codomain B consisting of all values
that the function f can produce from arguments in S:

    f(S) = { f(a) : a ∈ S }.

Thus, for example, range(f) = f(A).

Given a subset T ⊆ B of the codomain of f, the preimage of T under f,
denoted by f⁻¹(T), is the subset of the domain A consisting of all arguments
of f which produce values in T:

    f⁻¹(T) = { a ∈ A : f(a) ∈ T }.

Notice in particular that f⁻¹(B) = A, since every argument in A produces
some value in B. We can also note that images and preimages allow us to
view f and f⁻¹ as functions between the powersets 𝒫(A) and 𝒫(B):

    f : 𝒫(A) → 𝒫(B)   and   f⁻¹ : 𝒫(B) → 𝒫(A).
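
For a function on a finite domain, the definitions of range, image and preimage translate almost word for word into code. The following is a minimal sketch, assuming the function is stored as a Python dict whose keys form the domain; the helper names `image`, `preimage` and `range_of`, and the sample function, are illustrative choices rather than notation from the text.

```python
def image(f: dict, S: set) -> set:
    """f(S) = { f(a) : a in S }, where S is a subset of the domain."""
    return {f[a] for a in S}


def preimage(f: dict, T: set) -> set:
    """The set { a in A : f(a) in T }, where T is a subset of the codomain."""
    return {a for a in f if f[a] in T}


def range_of(f: dict) -> set:
    """range(f) is the image of the whole domain."""
    return image(f, set(f))


# A sample function from {1, 2, 3} to {"a", "b", "c"} (codomain assumed).
f = {1: "c", 2: "a", 3: "c"}

assert range_of(f) == {"a", "c"}
assert image(f, {1, 3}) == {"c"}
assert preimage(f, {"b", "c"}) == {1, 3}
assert preimage(f, {"a", "b", "c"}) == {1, 2, 3}  # preimage of the codomain is the domain
```

Note that the codomain is not recoverable from the dict itself, which is why `preimage` takes the target set T explicitly.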


Example

Consider the function f : {1, 2, 3} → {a, b, c} defined, as depicted below,
by f(1) = c, f(2) = a and f(3) = c.

[Diagram of f: 1 ↦ c, 2 ↦ a, 3 ↦ c.]

The domain of f is {1, 2, 3}.

The codomain of f is {a, b, c}.

The range of f is {a, c}.

f({1, 2}) = {a, c} and f({1, 3}) = {c}.

f⁻¹({b, c}) = {1, 3} and f⁻¹({c}) = {1, 3}.

(Solution on page 431)

What  is  the  range  of  the  function  score  from  Example  6.1? 

If  a  score  of  70  or  higher  is  considered  to  be  a  first-class  mark,  express 
the  set  of  students  who  have  scored  a  first-class  mark  as  a  preimage  of 
an  appropriate  set. 


Here  are  three  example  functions  defined  with  respect  to  an  arbitrary  set  A. 


1. The identity function id_A : A → A is the function which maps each
   element a of A to itself: id_A(x) = x for all x ∈ A.

2. The cardinality function |·| : 𝒫_fin(A) → ℕ maps each finite subset
   of A to the number of elements in that subset: |X| = the number
   of elements of X. (The cardinality of a set is simply the number of
   elements in the set.)

   Note that this function is only well-defined on finite sets. For example,
   there is no natural number n which denotes the number of elements
   in the set ℕ.

3. Given a subset S ⊆ A of A, its characteristic function χ_S : A → 𝔹
   indicates whether or not an object is an element of S:
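
The three functions above have direct counterparts in code. The sketch below is one possible rendering, with all names chosen for illustration; in particular, it assumes the characteristic function returns a Boolean, and it uses `frozenset` so that a finite subset can itself be passed around as a value.

```python
def identity(x):
    """id_A: map each element to itself."""
    return x


def cardinality(X: frozenset) -> int:
    """|X|: the number of elements of a finite set X."""
    return len(X)


def characteristic(S: set):
    """Return the characteristic function of S as a Boolean-valued function."""
    def chi(x) -> bool:
        return x in S
    return chi


evens = characteristic({0, 2, 4, 6, 8})
assert identity("a") == "a"
assert cardinality(frozenset({"x", "y", "z"})) == 3
assert evens(4) is True
assert evens(5) is False
```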


Indicate  which  of  the  following  are  functions  from  the  set  Humans  of  all 
humans  to  itself.  For  each  that  is  not  a  function,  indicate  why  it  fails  to  be 
a  function. 


1.  Mother(x)  represents  the  mother  of  x. 

2. Parent(x) represents the parent of x.

3. Child(x) represents the child of x.

4.  FirstBornChild(x)  represents  the  first-born  child  of  x. 


Functions are common in mathematics, where they are typically given by a
formula. For example, the function f : ℝ → ℝ defined by

    f(x) = x³ − x

takes a real value x ∈ ℝ and returns another real value f(x) ∈ ℝ which is
computed from x by the formula x³ − x. We can use this formula to calculate
the value of f(x) when x = 3/2:

    f(3/2) = (3/2)³ − 3/2 = 27/8 − 3/2 = 15/8.

Such functions are typically plotted as a graph on the xy-plane as in Figure 6.2,
where we have indicated the point (3/2, 15/8) on the graph.

Figure 6.2: The graph of the function f(x) = x³ − x.


Motivated by the above example, we can represent a function f : A → B
from A to B as a set of pairs over the Cartesian product A × B. The graph
of f, denoted graph(f), is the set of all pairs (a, b) ∈ A × B such that b = f(a).
Thus, for every a ∈ A there is exactly one b ∈ B such that (a, b) ∈ graph(f),
namely b = f(a). As an example, for the score function from Example 6.1,

    graph(score) = { (Andrews, 75), (Archer, 92), (Collins, 64),
                     (Davies, 88), (Evans, 78), (Fletcher, 46),
                     (Greene, 68), (Lewis, 54), (Parker, 64),
                     (Smith, 59), (Taylor, 100), (Williams, 78) };

and for f(x) = x³ − x,

    graph(f) = { (x, x³ − x) : x ∈ ℝ }.
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The  graph  of  a  function  provides  a  complete  description  of  the  function,  in 
that  two  functions  defined  over  the  same  domain  and  codomain  are  equal  if, 
and  only  if,  their  graphs  are  equal.  This  is  easily  proven  in  the  following. 

Theorem 6.4

Let f, g : A → B be two functions defined on the same domain and codomain.
Then f(a) = g(a) for all a ∈ A if, and only if, graph(f) = graph(g).

Proof: Suppose that f(a) = g(a) for all a ∈ A, and let (a, b) ∈ A × B be
arbitrary. We need to show that (a, b) ∈ graph(f) ⇔ (a, b) ∈ graph(g). But

    (a, b) ∈ graph(f) ⇔ b = f(a)
                      ⇔ b = g(a) ⇔ (a, b) ∈ graph(g).

Suppose now that graph(f) = graph(g), and let a ∈ A be arbitrary.
We need to show that f(a) = g(a). But (a, f(a)) ∈ graph(f), and since
graph(f) = graph(g), we have (a, f(a)) ∈ graph(g), and hence f(a) = g(a). □
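
On a finite domain the theorem can be observed directly by building graphs as sets of pairs. In the sketch below the domain `A` is an assumed small range of integers, and `g` is an assumed second formula for the same function (the factored form of x³ − x); neither appears in the text. The two functions agree pointwise, and correspondingly their graphs are equal as sets.

```python
def graph(f, domain):
    """graph(f) restricted to a finite domain: the set of pairs (a, f(a))."""
    return {(a, f(a)) for a in domain}


def f(x):
    return x**3 - x


def g(x):
    # The same function written in factored form.
    return x * (x - 1) * (x + 1)


A = range(-3, 4)  # a small finite domain chosen for illustration

assert all(f(a) == g(a) for a in A)   # pointwise agreement ...
assert graph(f, A) == graph(g, A)     # ... and equal graphs
```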


Exercise 6.4  (Solution on page 432)

What is the graph of the function f from Example 6.2?

A function f : A → B associates a single value b ∈ B to each value a ∈ A,
but the same value b ∈ B may be associated to more than one value in A;
that is, we may have two different values a, a' ∈ A such that f(a) = f(a').
For example, given the function f(x) = x³ − x there are three values of x
for which f(x) = 0, namely x = −1, x = 0 and x = 1.

If a function does not assign the same value to two different inputs, it is
said to be one-to-one (1-1), or injective.


Definition 6.4

A function f : A → B is one-to-one (1-1), or injective, if, and only
if, f(a) = f(a') implies that a = a' for all a, a' ∈ A. More formally:

    ∀a, a' ∈ A ( f(a) = f(a') ⇒ a = a' ).

In other words, there do not exist two different values in A which f maps
to the same value in B:

    ¬∃a ∈ A ∃a' ∈ A ( f(a) = f(a') ∧ a ≠ a' ).


Exercise 6.5  (Solution on page 432)

Indicate which of the following functions are one-to-one. For those that are
not one-to-one, indicate the reason that they fail to be one-to-one.

1. The function score : Class → Marks from Example 6.1.

2. The function f : ℝ → ℝ defined by f(x) = x².

3. The function f : ℕ → ℕ defined by f(x) = x².

Definition 6.5

A function f : A → B is onto, or surjective, if, and only if, its range is
equal to its codomain, range(f) = B; that is, every value b ∈ B is the image
of some value a ∈ A:

    ∀b ∈ B ∃a ∈ A ( f(a) = b ).

Exercise 6.6  (Solution on page 432)

Indicate which of the following functions are onto. For those that are not
onto, indicate the reason that they fail to be onto.

1. The function score : Class → Marks from Example 6.1.

2. The function f : ℝ → ℝ defined by f(x) = x².

3. The function f : ℕ → ℕ defined by f(x) = x².


A function f : A → B which is both one-to-one and onto is particularly
special: it defines a perfect correspondence between the sets A and B, in
that the function f pairs up the elements of A and B, with each element of
one set paired to exactly one element of the other set. Such a function is
referred to as a bijection.

Definition 6.6

A function f is a bijection if it is both one-to-one and onto.
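
For finite sets, the three properties can be tested mechanically. The following sketch assumes a function is given as a dict (keys are the domain) together with an explicit codomain, since a dict alone does not record its codomain; the function names and the two sample functions are illustrative.

```python
def is_injective(f: dict) -> bool:
    """One-to-one: no value is produced by two different arguments."""
    values = list(f.values())
    return len(values) == len(set(values))


def is_surjective(f: dict, codomain: set) -> bool:
    """Onto: the range of f equals the codomain."""
    return set(f.values()) == set(codomain)


def is_bijection(f: dict, codomain: set) -> bool:
    return is_injective(f) and is_surjective(f, codomain)


B = {"a", "b", "c"}
f = {1: "c", 2: "a", 3: "c"}   # 1 and 3 collide, and "b" is never produced
g = {1: "b", 2: "a", 3: "c"}   # pairs the two sets up exactly

assert not is_injective(f) and not is_surjective(f, B)
assert is_bijection(g, B)
```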


Exercise 6.7  (Solution on page 432)

Indicate which of the functions f₁, f₂, f₃ or f₄ depicted by the following
diagrams are one-to-one, which of them is onto, and which of them is both
(i.e., a bijection).


Let f : A → B be a bijection. Since f is onto, every element b ∈ B is
the image of some element a ∈ A; and since f is one-to-one, every element
b ∈ B is the image of a unique element a ∈ A. This suggests that we can
turn the mapping around, to invert it, and associate a unique element of A
with each element of B.

Definition 6.7

If f is a bijection, then the inverse function f⁻¹ : B → A is the function
that assigns to each element b ∈ B the unique element a ∈ A such that
f(a) = b. That is, f⁻¹(b) = a if, and only if, f(a) = b. This can be pictured
as follows:


Example 6.7

The function f : ℝ → ℝ defined by f(x) = 2x + 3 is a bijection, with
f⁻¹(y) = (y − 3)/2. For example,

    f(5) = 2·5 + 3 = 13   and   f⁻¹(13) = (13 − 3)/2 = 5.
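
Both directions of this example can be checked in a few lines. The sketch below uses exact rationals for the numeric inverse, and then shows a generic way to invert a finite function stored as a dict. The helper name `invert` and the sample table are assumptions; the helper refuses to proceed if the function is not one-to-one, since then no inverse exists.

```python
from fractions import Fraction


def f(x):
    return 2 * x + 3


def f_inverse(y):
    return (Fraction(y) - 3) / 2


assert f(5) == 13
assert f_inverse(13) == 5
assert all(f_inverse(f(x)) == x for x in range(-10, 11))


def invert(table: dict) -> dict:
    """Swap arguments and values of a finite one-to-one function."""
    inverse = {}
    for a, b in table.items():
        if b in inverse:
            raise ValueError(f"not one-to-one: {inverse[b]!r} and {a!r} both map to {b!r}")
        inverse[b] = a
    return inverse


assert invert({1: "b", 2: "a", 3: "c"}) == {"b": 1, "a": 2, "c": 3}
```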


More generally, if f : A → B is one-to-one, then f provides a bijection
from A to range(f), and we can define the inverse function f⁻¹ : range(f) → A.

Let A = { a, b, c, ..., z } be the set consisting of the usual 26 characters
of the alphabet. We can use a bijection f : A → A as the basis of a simple
encryption scheme. For example, suppose we take the bijection f defined as
follows:

    a b c d e f g h i j k l m
    ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
    y k t e c s w u b m z v l

    n o p q r s t u v w x y z
    ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
    q g d p o j f r h n a x i

To encode a message we apply the function f to each letter of the message.
For example, the message

    WE ATTACK AT DAWN

would be encoded as

    NC YFFYTZ YF EYNQ

It is important that the function f is a bijection. No two letters can be
mapped to the same letter, as otherwise it would be impossible to decode
since different messages would give rise to the same encrypted text.

In order to decode messages that we receive which are encoded as above,
we simply apply the inverse function f⁻¹ to each of the letters of the
encrypted text.

This encryption method is insecure; it is very easy to decode encrypted
messages even if you don’t know the function f with which they are
encrypted. However, the idea of using a bijection f to encode messages, thus
allowing such messages to be decoded with the inverse function f⁻¹, is
fundamental.
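
The scheme can be sketched in a few lines of code. In the sketch below, the 26-letter key string is read off the substitution table of this example, cross-checked against the sample message; the image of the letter o is taken to be g, the only letter left unused by the other 25 entries, so treat the key as a reconstruction. The names `ENCODE`, `DECODE`, `encode` and `decode` are illustrative.

```python
import string

PLAIN = string.ascii_lowercase
CIPHER = "yktecswubmzvlqgdpojfrhnaxi"  # image of a, b, c, ..., z under f

ENCODE = dict(zip(PLAIN, CIPHER))
DECODE = {c: p for p, c in ENCODE.items()}  # the inverse function

# f is a bijection exactly when inverting the table loses no entries.
assert len(DECODE) == 26


def apply_table(table: dict, message: str) -> str:
    """Apply a letter table to each letter; leave other characters alone."""
    return "".join(table.get(ch, ch) for ch in message.lower()).upper()


def encode(message: str) -> str:
    return apply_table(ENCODE, message)


def decode(message: str) -> str:
    return apply_table(DECODE, message)


assert encode("WE ATTACK AT DAWN") == "NC YFFYTZ YF EYNQ"
assert decode("NC YFFYTZ YF EYNQ") == "WE ATTACK AT DAWN"
```

If two letters shared an image, the dict comprehension building `DECODE` would silently drop one of them, which is the programmatic face of the remark that decoding would then be impossible.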


Exercise 6.8  (Solution on page 433)

What is the inverse of the function f of Example 6.8?

If we have a function f : A → B from A to B and another function g : B → C
from B to C, we can:

• first apply the function f to some argument a ∈ A to arrive at a value
  b = f(a) ∈ B;

• then use the value b = f(a) ∈ B as an argument to the function g to
  arrive at a value c = g(f(a)) ∈ C.

Composing two function applications, one after the other, is a very common
thing to do; it is commonly denoted by g ∘ f, and can be pictured as follows:

[Diagram of g ∘ f.]

Definition 6.8

Given a function f : A → B from A to B and a function g : B → C from B
to C, the composition of g and f is the function g ∘ f : A → C from A to
C defined by

    (g ∘ f)(x) = g(f(x)).

Note that the codomain of the function f must be the same as the domain
of the function g in order to form the composition. Also note carefully
the order of the functions: the composition g ∘ f of the functions g and
f first applies the function f to its input before applying the function g
to the result. The reason for writing g ∘ f rather than f ∘ g is to coincide
with the order in which the individual function applications appear:
(g ∘ f)(x) = g(f(x)).
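
The order of application is easy to get backwards, so a small sketch may help. The helper `compose` and the two sample functions below are illustrative assumptions; the point is only that `compose(g, f)` applies f first and g second, and that swapping the arguments generally gives a different function.

```python
def compose(g, f):
    """Return g ∘ f, the function that sends x to g(f(x))."""
    def g_after_f(x):
        return g(f(x))
    return g_after_f


def f(x):
    return 2 * x + 3


def g(x):
    return x * x


assert compose(g, f)(1) == 25  # g(f(1)) = g(5) = 25
assert compose(f, g)(1) == 5   # f(g(1)) = f(1) = 5
```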


Exercise 6.9  (Solution on page 433)

Consider the following two functions f and g from {1, 2, 3, 4} to itself:

Find f ∘ g and g ∘ f.

If f : A → A then we can compose f with itself. In this case we typically
write f² for f ∘ f, and more generally fⁿ⁺¹ = f ∘ fⁿ. In other words,

    fⁿ = f ∘ f ∘ ⋯ ∘ f   (n times).

As special cases we have f⁰ = id_A and f¹ = f, noting that f ∘ id_A = f (see
exercise 8, page 177).
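
Repeated composition can be sketched as a loop that applies f the required number of times. The helper name `power` and the sample function `double` are assumptions for illustration; with n = 0 the loop body never runs, so the result is the identity function, matching the special case f⁰ = id_A.

```python
def power(f, n: int):
    """Return the n-fold composition of f with itself (n >= 0)."""
    if n < 0:
        raise ValueError("n must be non-negative")

    def f_n(x):
        for _ in range(n):
            x = f(x)
        return x

    return f_n


def double(x):
    return 2 * x


assert power(double, 0)(5) == 5    # zero applications: the identity
assert power(double, 1)(5) == 10   # one application: f itself
assert power(double, 3)(5) == 40   # f(f(f(5)))
```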

If  we  compose  two  one-to-one  functions,  we  will  arrive  at  yet  another 
one-to-one  function.  The  same  is  true  of  onto  functions.  These  facts  are 
demonstrated  in  the  following  two  theorems. 

Theorem 6.9

If f : A → B and g : B → C are both one-to-one, then so is g ∘ f : A → C.

Proof: Suppose (g ∘ f)(x) = (g ∘ f)(y); that is, g(f(x)) = g(f(y)).

What we need to demonstrate is that x = y.

Since g is one-to-one, f(x) = f(y).

Hence, since f is one-to-one, x = y. □

Theorem 6.10

If f : A → B and g : B → C are both onto, then so is g ∘ f : A → C.

Proof: Suppose c ∈ C.

What we need to demonstrate is that c = (g ∘ f)(a) for some a ∈ A.

Since g is onto, c = g(b) for some b ∈ B.

Since f is onto, b = f(a) for some a ∈ A.

Hence c = g(f(a)) = (g ∘ f)(a). □


Exercise 6.10  (Solution on page 433)

Prove that if f : A → B and g : B → C are both bijections, then so is
g ∘ f : A → C.

Exercise 6.11  (Solution on page 433)

Prove that if f : A → B is a bijection, then f⁻¹ ∘ f = id_A and f ∘ f⁻¹ = id_B.

Exercise 6.12  (Solution on page 433)

Prove that function composition is associative: if f : A → B, g : B → C
and h : C → D, then h ∘ (g ∘ f) = (h ∘ g) ∘ f.


6.4  Comparing the Sizes of Sets

We  can  easily  compare  the  sizes  (the  cardinalities)  of  two  finite  sets  simply 
by  counting  their  elements;  the  size  of  one  is  greater  than  the  size  of  the 
other  if  it  contains  more  elements,  and  the  two  sets  are  the  same  size  if  they 
contain  the  same  number  of  elements. 

Counting  the  number  of  elements  in  a  finite  set  involves  listing  them  in 
some  arbitrary  order,  denoting  one  of  them  to  be  the  first  element,  another 
to  be  the  second  element,  and  so  on  to  the  last  element.  For  example,  we 
would  conclude  that  the  set  {  Joel,  Felix,  Oskar,  Amanda}  has  four  elements 
by  virtue  of  the  fact  that  we  could  find  a  one-to-one  and  onto  function  (a 
bijection) 

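$f : \{1, 2, 3, 4\} \to \{\text{Joel}, \text{Felix}, \text{Oskar}, \text{Amanda}\}$

which effectively lists the elements of the set. For example, the function $f$
may list this set (alphabetically) as follows:

$$\begin{array}{rcl}
f : 1 & \mapsto & \text{Amanda} \\
2 & \mapsto & \text{Felix} \\
3 & \mapsto & \text{Joel} \\
4 & \mapsto & \text{Oskar}
\end{array}$$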

This  bijection  demonstrates  that  the  two  sets  {  Joel,  Felix,  Oskar,  Amanda} 
and  {  1,  2,  3,  4}  are  the  same  size  (i.e.,  have  the  same  cardinality). 

We  can  compare  the  sizes  of  any  two  sets  by  trying  to  find  a  bijection 
between  them  which  would  demonstrate  that  the  two  sets  are  the  same  size. 
If  such  a  bijection  doesn’t  exist,  then  one  set  must  be  bigger  than  the  other. 
For  example,  if  we  try  to  find  a  bijection 

$f : \{\text{Joel}, \text{Felix}, \text{Oskar}, \text{Amanda}\} \to \{\text{cola}, \text{fanta}, \text{sprite}\}$

we  would  quickly  realise  that  this  would  be  impossible,  as  no  such  function 
could  be  one-to-one:  some  element  of  the  second  set  would  have  to  be  the 
image  of  more  than  one  element  of  the  first  set  since  there  are  not  enough 
elements  in  the  second  set  to  go  around.  If  this  function  was  aimed  at 
providing  each  child  with  a  drink,  then  it  is  clear  that  some  drink  would 
have  to  be  shared. 

For  the  same  reason,  no  function 
$f : \{\text{cola}, \text{fanta}, \text{sprite}\} \to \{\text{Joel}, \text{Felix}, \text{Oskar}, \text{Amanda}\}$

could  be  onto.  If  this  function  was  aimed  at  distributing  drinks  to  children, 
then  it  is  clear  that  at  least  one  child  would  not  get  a  drink. 
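Because both sets are small, the two claims can be confirmed by brute force. The sketch below enumerates every function between the two sets; the helper `all_functions` is a name introduced here for illustration.

```python
from itertools import product

children = ["Joel", "Felix", "Oskar", "Amanda"]
drinks = ["cola", "fanta", "sprite"]

def all_functions(domain, codomain):
    """Yield every function domain -> codomain as a dict."""
    for images in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, images))

# One-to-one functions from children to drinks (3^4 = 81 candidates).
one_to_one = [f for f in all_functions(children, drinks)
              if len(set(f.values())) == len(children)]

# Onto functions from drinks to children (4^3 = 64 candidates).
onto = [g for g in all_functions(drinks, children)
        if set(g.values()) == set(children)]

print(len(one_to_one), len(onto))
```

Both lists come out empty, matching the argument that some drink must be shared and some child must go without.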

Given two arbitrary sets $A$ and $B$, we would naturally consider $B$ to be
at least as large as $A$ if we could find a one-to-one function $f : A \to B$, since
$f$ would associate each element of $A$ with its own element of $B$, so intuitively
there would have to be at least as many elements of $B$ as there are of $A$. On
the other hand, we would naturally consider $A$ to be at least as large as $B$
if we could find an onto function $f : A \to B$, since $f$ would associate each
element of $B$ with at least one element of $A$ which is not associated with
any other element of $B$, so intuitively there would have to be at least as
many elements of $A$ as there are of $B$. Finally, we would naturally consider
the two sets to be of the same size (cardinality) if we could find a bijection
$f : A \to B$ giving a direct correspondence associating each element of one
of the sets with its own element of the other set. We will denote that a set
$A$ is no bigger than, no smaller than, and the same size as $B$ by $A \preceq B$,
$A \succeq B$, and $A \approx B$, respectively, and summarise this discussion as follows.

Definition 6.12

• $A \preceq B$ if, and only if, there exists a one-to-one function $f : A \to B$.

• $A \succeq B$ if, and only if, there exists an onto function $f : A \to B$.

• $A \approx B$ if, and only if, there exists a bijection $f : A \to B$.


The  following  results  show  that  these  definitions  make  sense  in  terms  of 
comparing  sizes  of  sets.  The  first  result  says  that  one  set  is  no  bigger  than 
a  second  if,  and  only  if,  the  second  is  no  smaller  than  the  first.  The  second 
result  says  that  two  sets  are  the  same  size  if,  and  only  if,  each  is  no  larger 
than  the  other. 

Theorem 6.12

$A \preceq B$ if, and only if, $B \succeq A$. That is, there exists a one-to-one function
$f : A \to B$ if, and only if, there exists an onto function $g : B \to A$.


Proof: Suppose that $f : A \to B$ is one-to-one, and fix some element
$a_0 \in A$.¹ We can define the function $g : B \to A$ as follows:

• if $b \in \mathrm{range}(f)$ then $b = f(a)$ for a unique value $a \in A$, and we define
$g(b)$ to be this unique value $a$;

• if $b \notin \mathrm{range}(f)$ then we define $g(b)$ to be $a_0$.

This function $g$ is onto, as $a = g(f(a))$ for each element $a \in A$.

Suppose now that $g : B \to A$ is onto. For each value $a \in A$, fix some
value $b_a$ such that $g(b_a) = a$. Then the function $f : A \to B$ defined as
$f(a) = b_a$ for each $a \in A$ is clearly one-to-one: if $b_a = b_{a'}$ then
$a = g(b_a) = g(b_{a'}) = a'$. □

¹ If no such $a_0$ exists, that is if $A = \emptyset$, then $\emptyset$ trivially represents the graph of a one-to-one
function from $A$ to $B$, as well as the graph of an onto function from $B$ to $A$.
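For finite sets, both constructions used in this proof can be written out directly. The following is a sketch under the assumption that functions are stored as dictionaries; the function names and the sample sets are illustrative only.

```python
def onto_from_one_to_one(f, B, a0):
    """Given a one-to-one f: A -> B (a dict) and a fixed a0 in A, build an onto g: B -> A."""
    inverse = {b: a for a, b in f.items()}   # well defined because f is one-to-one
    return {b: inverse.get(b, a0) for b in B}

def one_to_one_from_onto(g, A):
    """Given an onto g: B -> A (a dict), choose one preimage b_a for each a in A."""
    chosen = {}
    for b, a in g.items():
        chosen.setdefault(a, b)              # keep the first b found with g(b) = a
    return {a: chosen[a] for a in A}

A = [1, 2]
B = ["x", "y", "z"]
f = {1: "x", 2: "z"}                         # one-to-one, but "y" is not in its range

g = onto_from_one_to_one(f, B, a0=1)         # elements outside range(f) are sent to a0
f_again = one_to_one_from_onto(g, A)
print(g, f_again)
```

In the second construction the choice of $b_a$ is arbitrary; the sketch simply takes the first preimage it meets, which is always possible for a finite set.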


Theorem 6.13  Schröder-Bernstein Theorem


$A \approx B$ if, and only if, $A \preceq B$ and $B \preceq A$.


Proof: Suppose we have functions $f : A \to B$ and $g : B \to A$ which are
both one-to-one; we wish to construct a bijection $h : A \to B$.

For any $a \in A$, consider the sequence generated from $a$ by alternately
applying $g^{-1}$ and $f^{-1}$ whenever possible:


$$a \;\mapsto\; g^{-1}(a) \;\mapsto\; f^{-1}(g^{-1}(a)) \;\mapsto\; g^{-1}(f^{-1}(g^{-1}(a))) \;\mapsto\; \cdots$$


This is possible since $f$ and $g$ are one-to-one, and hence $f^{-1} : \mathrm{range}(f) \to A$
and $g^{-1} : \mathrm{range}(g) \to B$ are well-defined functions. However, this sequence
may stop at some point, either at an element of $A$ not in the range of $g$ (and
hence for which $g^{-1}$ is not defined) or at an element of $B$ not in the range
of $f$ (and hence for which $f^{-1}$ is not defined).

We can then define our bijection $h : A \to B$ as follows:


$$h(a) = \begin{cases}
g^{-1}(a), & \text{if the sequence generated by } a \text{ ends at an element of } B; \\
f(a), & \text{otherwise.}
\end{cases}$$

This is a well-defined function, since $g^{-1}(a)$ will be defined if the sequence
generated by $a$ ends at an element of $B$ (in particular, not at $a$). It remains
to demonstrate that this function is one-to-one and onto.

To demonstrate that $h$ is one-to-one, let us assume that $h(x) = h(y)$,
and show that we must have $x = y$.


• If the sequences generated by $x$ and $y$ both end at elements in $B$,
then $h(x) = g^{-1}(x)$ and $h(y) = g^{-1}(y)$, so $g^{-1}(x) = g^{-1}(y)$, and hence
$x = y$.

• If neither sequence generated by $x$ and $y$ ends at an element of $B$, then
$h(x) = f(x)$ and $h(y) = f(y)$, so $f(x) = f(y)$, and hence $x = y$.

• If the sequence generated by $x$ ends at an element of $B$, but not so for
the sequence generated by $y$, then $h(x) = g^{-1}(x)$ and $h(y) = f(y)$, so
$g^{-1}(x) = f(y)$. But then $y = f^{-1}(g^{-1}(x))$ would appear (as the third
element) in the sequence generated by $x$, contradicting the assumption
that its sequence ends differently to that generated by $x$.

• If the sequence generated by $y$ ends at an element of $B$, but not so for
the sequence generated by $x$, then $h(y) = g^{-1}(y)$ and $h(x) = f(x)$, so
$g^{-1}(y) = f(x)$. But then $x = f^{-1}(g^{-1}(y))$ would appear (as the third
element) in the sequence generated by $y$, contradicting the assumption
that its sequence ends differently to that generated by $y$.

To demonstrate that $h$ is onto, let us assume that $b \in B$, and show that
we must have $b = h(a)$ for some $a \in A$.

• If the sequence generated by $g(b)$ ends at an element of $B$, then

$h(g(b)) = g^{-1}(g(b)) = b$.

• If the sequence generated by $g(b)$ does not end at an element of $B$,
then $f^{-1}(b)$ must be defined and appear (as the third element) in the
sequence generated by $g(b)$, and hence $h(f^{-1}(b)) = f(f^{-1}(b)) = b$. □
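The construction of $h$ in this proof can be sketched in Python. The example is an assumption made for illustration: it takes $A = B = \mathbb{N}$ with $f(n) = g(n) = n + 1$, and represents the partial inverses as functions returning `None` where they are undefined. The sketch also assumes that every sequence stops, which holds in this example but not in general.

```python
def make_bijection(f, f_inv, g_inv):
    """Build h from one-to-one f: A -> B and the partial inverses of f and g."""
    def ends_in_B(a):
        # Follow a, g^-1(a), f^-1(g^-1(a)), ... until an inverse is undefined.
        x, in_A = a, True
        while True:
            y = g_inv(x) if in_A else f_inv(x)
            if y is None:
                return not in_A      # the sequence stopped at x, on this side
            x, in_A = y, not in_A

    def h(a):
        return g_inv(a) if ends_in_B(a) else f(a)
    return h

# Hypothetical example: f(n) = g(n) = n + 1 on the nonnegative integers.
f = lambda n: n + 1
f_inv = lambda n: n - 1 if n >= 1 else None   # 0 is not in range(f)
g_inv = lambda n: n - 1 if n >= 1 else None   # 0 is not in range(g)

h = make_bijection(f, f_inv, g_inv)
values = [h(a) for a in range(6)]
print(values)
```

In this example $h$ swaps each even number with its odd successor, so neither $f$ nor $g$ is onto, yet $h$ is a bijection.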


These  definitions  are  unremarkable  for  finite  sets,  but  reveal  surprising 
relationships  between  infinite  sets,  as  the  following  example  demonstrates. 


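Example 6.13

The set $\mathbb{N}$ of nonnegative integers in some sense contains almost twice as
many elements as the set $E = \{0, 2, 4, \ldots\}$ of nonnegative even integers.
However, the function $f : \mathbb{N} \to E$ defined by $f(n) = 2n$ provides a bijection
from $\mathbb{N}$ to $E$, demonstrating that there are in fact the same “number” of even
integers as there are integers. This bijection can be pictured as follows:

$$\begin{array}{r*{12}{c}}
f : & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \cdots \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \\
 & 0 & 2 & 4 & 6 & 8 & 10 & 12 & 14 & 16 & 18 & 20 & \cdots
\end{array}$$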


The confusion arising from the above example is with the idea of the
“number” of elements of an infinite set. There are in fact an infinite number
of objects in each of the sets, and as such there is no problem with
considering them to have the same cardinality.

Realising  that  the  set  of  even  integers  is  no  smaller  than  the  set  of  all 
integers,  it  may  seem  that  one  infinite  set  is  as  big  as  any  other.  In  fact, 
some  infinite  sets  are  larger  than  others.  To  explore  this  idea,  we  start  with 
the  following  definitions. 

Definition 6.13

A set $A$ is said to be finite if, and only if, there is a bijection

$f : \{1, 2, 3, \ldots, n\} \to A$

for some $n \in \mathbb{N}$. This function effectively lists all of the elements of $A$, and
the value $n$ is the cardinality of $A$: $|A| = n$.

A set $A$ is said to be countably infinite if, and only if, there is a
bijection

$f : \mathbb{N} \to A$.

This function lists the elements of $A$ in an infinite list.

Finally, a set is said to be countable if, and only if, it is finite or countably
infinite; and it is said to be uncountable if, and only if, it is not
countable.


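Example 6.14


The set of integers $\mathbb{Z}$ is countable. A bijection $f : \mathbb{N} \to \mathbb{Z}$ witnessing this
fact can be defined as

$$f(n) = \begin{cases}
\dfrac{n+1}{2}, & \text{if } n \text{ is odd;} \\[1ex]
-\dfrac{n}{2}, & \text{if } n \text{ is even.}
\end{cases}$$

This function would list the integers as follows:


$$\begin{array}{r*{12}{c}}
f : & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \cdots \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \\
 & 0 & 1 & -1 & 2 & -2 & 3 & -3 & 4 & -4 & 5 & -5 & \cdots
\end{array}$$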

Clearly  this  function  is  one-to-one  and  onto,  as  every  integer  will  appear 
exactly  once  in  this  list. 
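The listing of the integers can be reproduced with a short Python sketch. It assumes the rule read off from the listing above: odd $n$ is sent to $(n+1)/2$ and even $n$ to $-n/2$.

```python
def f(n):
    """List the integers: odd n go to the positive integers, even n to 0, -1, -2, ..."""
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)

listing = [f(n) for n in range(11)]
print(listing)
```

Checking that the first $2001$ values are exactly the integers from $-1000$ to $1000$, each appearing once, gives some confidence in the rule, although it is no substitute for a proof.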


Exercise 6.14 (Solution on page 433)

What is the inverse $f^{-1} : \mathbb{Z} \to \mathbb{N}$ of the bijection $f : \mathbb{N} \to \mathbb{Z}$ given in
Example 6.14?

Exercise 6.15 (Solution on page 434)

Prove that $A \approx B$ (that is, $A$ and $B$ are the same size) for any two countably
infinite sets $A$ and $B$.

That is, given bijections $f : \mathbb{N} \to A$ and $g : \mathbb{N} \to B$, show how to
construct a bijection $h : A \to B$.


As  an  example  of  the  difference  between  countable  and  uncountable  sets, 
we  shall  see  that  there  are  far  more  numbers  on  the  real  number  line  than 
just  the  integers;  that  is,  the  set  of  real  numbers  R  is  uncountable.  This 
may  seem  perfectly  sensible,  as  these  numbers  fill  the  number  line:  between 
any two different real numbers, no matter how close they are to each other,
you  can  find  a  third.  The  integers,  on  the  other  hand,  are  relatively  few  and 
far  between. 

While such an intuitive argument gives rise to a valid result in this case,
the same intuition would lead you to believe that there are uncountably
many rational numbers, as between any two different rational numbers,
no matter how close they are to each other, you can find a third. However,
before we demonstrate that there are uncountably many reals, we first
demonstrate that this intuition about the rationals is faulty; the rationals
are countable, and hence no more numerous than the integers.


The set $\mathbb{Q}^+$ of positive rational numbers is countable. To see this, we need
to find a bijection


$f : \mathbb{N} \to \mathbb{Q}^+$

which completely lists them. To this end, we first note that a positive
rational is a number of the form $\frac{p}{q}$, where $p$ and $q$ are positive integers, and
we can arrange these in an infinite number of infinite rows by:

•  listing  all  the  rationals  with  numerator  p  =  1  in  the  first  row, 

•  listing  all  the  rationals  with  numerator  p  =  2  in  the  second  row, 

•  listing  all  the  rationals  with  numerator  p  =  3  in  the  third  row, 

and so on as depicted in Figure 6.3. We can then zigzag diagonally through
this arrangement as depicted in Figure 6.3, listing the rationals in the order
in which they are encountered. However, we only list rationals that appear
in lowest form, and ignore those (depicted crossed out in grey circles in
Figure 6.3) that are not in lowest form; for example, we do not include $\frac{2}{4}$ in
our listing as it will have already appeared earlier in our list as $\frac{1}{2}$.

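The resulting listing provides the required bijection $f : \mathbb{N} \to \mathbb{Q}^+$:

$$\begin{array}{r*{12}{c}}
f : & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \cdots \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \\
 & \frac{1}{1} & \frac{2}{1} & \frac{1}{2} & \frac{1}{3} & \frac{3}{1} & \frac{4}{1} & \frac{3}{2} & \frac{2}{3} & \frac{1}{4} & \frac{1}{5} & \frac{5}{1} & \cdots
\end{array}$$

This function is one-to-one and onto as only rationals $\frac{p}{q}$ in lowest form
appear in the list and each of these is encountered once and only once while
zigzagging through the arrangement.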
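The zigzag listing can be sketched as a Python generator. Figure 6.3 is not reproduced here, so the direction of travel along each diagonal is an assumption inferred from the first eleven entries of the listing: diagonals with $p + q$ odd are walked with $p$ decreasing, and those with $p + q$ even with $p$ increasing.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Yield the positive rationals p/q in zigzag order, skipping non-lowest forms."""
    s = 2                                   # s = p + q identifies a diagonal
    while True:
        ps = range(1, s) if s % 2 == 0 else range(s - 1, 0, -1)
        for p in ps:
            q = s - p
            if gcd(p, q) == 1:              # keep only fractions in lowest form
                yield Fraction(p, q)
        s += 1

first = list(islice(positive_rationals(), 11))
print([str(x) for x in first])
```

Skipping the fractions with $\gcd(p, q) \neq 1$ is what prevents a rational from being listed twice.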

Extending  this  result  to  show  that  the  set  Q  of  all  the  rational  numbers 
is  countable  is  straightforward. 
Prove that the set $\mathbb{Q}$ of all rationals is countable.


The set $[0, 1]$ of nonnegative real numbers no greater than 1 is uncountable.
To see this we must show that no bijection $f : \mathbb{N} \to [0, 1]$ exists. To this
end, assume that $f$ is such a function, and consider the listing of the real
numbers that it gives:


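$$\begin{array}{rcl}
f : 0 & \mapsto & 0.\ \boxed{d_{00}}\ d_{01}\ d_{02}\ d_{03}\ d_{04}\ d_{05} \cdots \\
1 & \mapsto & 0.\ d_{10}\ \boxed{d_{11}}\ d_{12}\ d_{13}\ d_{14}\ d_{15} \cdots \\
2 & \mapsto & 0.\ d_{20}\ d_{21}\ \boxed{d_{22}}\ d_{23}\ d_{24}\ d_{25} \cdots \\
3 & \mapsto & 0.\ d_{30}\ d_{31}\ d_{32}\ \boxed{d_{33}}\ d_{34}\ d_{35} \cdots \\
4 & \mapsto & 0.\ d_{40}\ d_{41}\ d_{42}\ d_{43}\ \boxed{d_{44}}\ d_{45} \cdots \\
5 & \mapsto & 0.\ d_{50}\ d_{51}\ d_{52}\ d_{53}\ d_{54}\ \boxed{d_{55}} \cdots \\
 & \vdots &
\end{array}$$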

Each number in this list, being a nonnegative real number no greater than
1, is given by an infinite decimal expansion with a leading 0. In particular,
the value 0 appears as $0.00000\cdots$ and the value 1 appears as $0.99999\cdots$.
Consider now the real number

$r = 0.r_0 r_1 r_2 r_3 r_4 r_5 \cdots$

in which the $i$th decimal digit is given by

$r_i = (d_{ii} + 5) \bmod 10$.

That is, the $i$th decimal digit of $r$ is defined to differ by 5 from the $i$th
decimal digit of $f(i)$.

Assuming that the function $f$ above is indeed a bijection, and in particular
onto, the value $r$ must appear somewhere in the list; that is, we must
have $r = f(n)$ for some $n \in \mathbb{N}$. However, for each $n$, $r$ differs (by 5) from
$f(n)$ in the $n$th decimal place, meaning that we cannot have $r = f(n)$.
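The diagonal construction can be tried out on a finite fragment of a listing. In the sketch below the digit strings are made-up data standing for the first six decimal digits of $f(0), \ldots, f(5)$, and the function name is an assumption for the example.

```python
def diagonal_number(rows):
    """Given leading digits of f(0), f(1), ... as strings, return the digits r_0, r_1, ... of r."""
    return "".join(str((int(row[i]) + 5) % 10) for i, row in enumerate(rows))

# Made-up digits: row i holds d_i0 d_i1 d_i2 ... for the i-th listed number.
rows = ["141592", "718281", "414213", "000000", "999999", "500000"]
r = diagonal_number(rows)
print("0." + r)   # differs from row i in its i-th digit, for every i
```

However long the listing is, the same rule yields a number that disagrees with the $n$th entry in the $n$th place, which is the heart of the argument.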


An  infinite  set  may  thus  be  either  countably  infinite  or  uncountably 
infinite.  In  Exercise  6.15  we  saw  that  any  two  countably-infinite  sets  are  the 
same  size,  but  the  same  is  not  true  of  two  uncountable  sets.  The  following 
exercise  demonstrates  that  given  any  set,  no  matter  how  big,  you  can  always 
construct  an  even  bigger  set  by  merely  taking  its  powerset. 


Exercise 6.17 (Solution on page 434)

Show that the powerset $\mathcal{P}(A)$ of any set $A$ is strictly larger than $A$, by
showing that no function $f : A \to \mathcal{P}(A)$ can be onto.

(Hint: Show that the set $B = \{ x \in A : x \notin f(x) \}$ is different from $f(a)$
for all $a \in A$.)
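The claim in this exercise can be observed experimentally for a small set before attempting the proof. The sketch below, with helper names chosen for illustration, tries every function from $A = \{0, 1, 2\}$ to its powerset and checks whether the set $B$ from the hint ever occurs as an image.

```python
from itertools import combinations, product

def powerset(A):
    """All subsets of A, as frozensets."""
    return [frozenset(c) for k in range(len(A) + 1) for c in combinations(A, k)]

A = [0, 1, 2]
subsets = powerset(A)

def diagonal_set(f):
    """The set B = {x in A : x not in f(x)} from the hint."""
    return frozenset(x for x in A if x not in f[x])

# Try every function f: A -> P(A) (8^3 = 512 of them) and look for one with B in its image.
counterexamples = [
    images for images in product(subsets, repeat=len(A))
    if diagonal_set(dict(zip(A, images))) in images
]
print(len(subsets), len(counterexamples))
```

The search finds no counterexample, which is consistent with the exercise but proves it only for this one set.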
In this section, as an example in working with sets, we prove an important
result on the existence of (greatest and least) fixed points of monotonic
functions defined on the powerset of a given set. We also describe a procedure
for calculating these fixed points.


Definition 6.17


Let $S$ be a set, and let $f : \mathcal{P}(S) \to \mathcal{P}(S)$ be a function which maps subsets
of $S$ to subsets of $S$.


• $f$ is monotonic if, and only if, $f(A) \subseteq f(B)$ whenever $A \subseteq B$.

• $A \subseteq S$ is a fixed point of $f$ if, and only if, $f(A) = A$.

• $A \subseteq S$ is the greatest fixed point of $f$ - denoted $\mathrm{gfp}(f)$ - if, and
only if, $A$ is a fixed point (i.e., $f(A) = A$) and $A$ is larger than all other
fixed points: if $f(B) = B$ then $B \subseteq A$.

• $A \subseteq S$ is the least fixed point of $f$ - denoted $\mathrm{lfp}(f)$ - if, and only if,
$A$ is a fixed point (i.e., $f(A) = A$) and $A$ is smaller than all other fixed
points: if $f(B) = B$ then $A \subseteq B$.


Note  that  fixed  points  need  not  exist;  and  even  if  they  do  exist,  then 
there  is  no  guarantee  that  greatest  and/or  least  fixed  points  exist. 

Example 6.17

Let $S = \{0\}$ and define $f : \mathcal{P}(S) \to \mathcal{P}(S)$ by $f(\emptyset) = S$ and $f(S) = \emptyset$.
Clearly $f$ does not have a fixed point.


Exercise 6.18 (Solution on page 434)

Define a function $f : \mathcal{P}(S) \to \mathcal{P}(S)$ over the set $S = \{1, 2\}$ which has two
fixed points which are neither greatest nor least fixed points.


The  following  result,  however,  shows  that  both  greatest  and  least  fixed 
points  exist  for  monotonic  functions. 

Theorem 6.18  Knaster-Tarski Theorem

If $f : \mathcal{P}(S) \to \mathcal{P}(S)$ is monotonic, then $f$ has both greatest and least fixed
points. Furthermore, these can be defined as follows:

• $\mathrm{gfp}(f) = \bigcup \{ A \subseteq S : A \subseteq f(A) \}$; and

• $\mathrm{lfp}(f) = \bigcap \{ A \subseteq S : f(A) \subseteq A \}$.


Proof: We will prove the result about the greatest fixed point $\mathrm{gfp}(f)$ and
leave the result about the least fixed point $\mathrm{lfp}(f)$ as an exercise (Exercise 12,
page 178).

To this end, let $G = \bigcup \{ A \subseteq S : A \subseteq f(A) \}$ as in the Theorem. We
first demonstrate that $G \subseteq f(G)$ by showing that given any $a \in G$ we must
have that $a \in f(G)$.

Suppose $a \in G$. By the definition of $G$, this means that $a \in A$ for some
$A \subseteq S$ such that $A \subseteq f(A)$. Hence $a \in f(A)$. Moreover, $A \subseteq G$ (as
$G$ is the union of all such sets), so by the monotonicity of $f$ we have
that $f(A) \subseteq f(G)$. Hence $a \in f(G)$ as required.

Next, we demonstrate the reverse inclusion, that $f(G) \subseteq G$.


The  Knaster-Tarski  Theorem  175 


Since we’ve shown that $G \subseteq f(G)$, by the monotonicity of $f$ we have
that $f(G) \subseteq f(f(G))$. This means that $f(G)$ is one of the sets in the
family of sets whose union is $G$, and hence $f(G) \subseteq G$.

We’ve thus shown that $G$ is a fixed point of $f$. It remains to show that it
is the greatest fixed point. To this end, suppose that $X$ is any fixed point
of $f$. Since $X = f(X)$, and so in particular $X \subseteq f(X)$, $X$ is one of the sets
in the family of sets whose union is $G$, and hence $X \subseteq G$. □

Beyond knowing that greatest and least fixed points of $f$ exist, we would
like to know how to calculate them without having to calculate $f(A)$ for all
subsets $A \subseteq S$. To do this, we can exploit the following observations.

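Theorem 6.19

For all $n \in \mathbb{N}$,

1. $f^n(\emptyset) \subseteq f^{n+1}(\emptyset)$ and $f^n(\emptyset) \subseteq \mathrm{lfp}(f)$;

2. $f^n(S) \supseteq f^{n+1}(S)$ and $f^n(S) \supseteq \mathrm{gfp}(f)$.

From this we can deduce the following:

(a) $\bigcup_{n \in \mathbb{N}} f^n(\emptyset) \subseteq \mathrm{lfp}(f)$ and $\bigcap_{n \in \mathbb{N}} f^n(S) \supseteq \mathrm{gfp}(f)$;

(b) If $f^n(\emptyset) = f^{n+1}(\emptyset)$ then $\mathrm{lfp}(f) = f^n(\emptyset)$;

(c) If $f^n(S) = f^{n+1}(S)$ then $\mathrm{gfp}(f) = f^n(S)$;

(d) If $|S| = n$ then $\mathrm{lfp}(f) = f^n(\emptyset)$ and $\mathrm{gfp}(f) = f^n(S)$.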


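Proof: We prove only 1., by straightforward induction, and leave 2. and
the corollaries (a)-(d) as exercises (Exercise 13, page 178).

For the base case, $f^0(\emptyset) = \emptyset$, so clearly $f^0(\emptyset) \subseteq f^1(\emptyset)$ and $f^0(\emptyset) \subseteq \mathrm{lfp}(f)$.
For the induction case, assuming that $f^{n-1}(\emptyset) \subseteq f^n(\emptyset)$ and that
$f^{n-1}(\emptyset) \subseteq \mathrm{lfp}(f)$, the monotonicity of $f$ gives:

• $f^n(\emptyset) = f(f^{n-1}(\emptyset)) \subseteq f(f^n(\emptyset)) = f^{n+1}(\emptyset)$; and

• $f^n(\emptyset) = f(f^{n-1}(\emptyset)) \subseteq f(\mathrm{lfp}(f)) = \mathrm{lfp}(f)$. □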


Thus, in order to calculate the least fixed point $\mathrm{lfp}(f)$ of $f$, we can
repeatedly apply $f$ starting from the empty set $\emptyset$ until we arrive at a fixed
point, which by the above will be $\mathrm{lfp}(f)$:

$$\emptyset = f^0(\emptyset) \subseteq f^1(\emptyset) \subseteq f^2(\emptyset) \subseteq \cdots \subseteq f^n(\emptyset) = f^{n+1}(\emptyset) = \cdots$$

A similar procedure, starting from $S$, will give us the greatest fixed point.
This is guaranteed to work if the set $S$ is finite; however, if $S$ is infinite, we
may generate infinite sequences of sets which approach yet never reach the
fixed points.
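The iteration procedure can be sketched in Python for a finite set $S$. The example function is an assumption chosen for illustration: $S$ is the node set of a small directed graph and $f(X)$ consists of node 0 together with every successor of a node in $X$, which is monotonic.

```python
def iterate_to_fixed_point(f, start):
    """Apply f repeatedly from `start` until the value stops changing."""
    current = start
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt

# Hypothetical example: a graph on S = {0, 1, 2, 3, 4} with two separate cycles.
S = frozenset(range(5))
edges = {0: {1}, 1: {2}, 2: {0}, 3: {4}, 4: {3}}

def f(X):
    """Node 0 together with all successors of nodes in X (a monotonic function)."""
    return frozenset({0}) | frozenset(t for x in X for t in edges[x])

lfp = iterate_to_fixed_point(f, frozenset())   # start from the empty set
gfp = iterate_to_fixed_point(f, S)             # start from the whole set S
print(sorted(lfp), sorted(gfp))
```

Here the least fixed point is the set of nodes reachable from 0, while the greatest fixed point is all of $S$. For a function that is not monotonic the loop need not terminate, so the sketch relies on the hypothesis of the theorem.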

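Exercise 6.20 (Solution on page 434)

Let $f : \mathcal{P}(\mathbb{N}) \to \mathcal{P}(\mathbb{N})$ be defined by $f(S) = \{0\} \cup \{ n+2 : n \in S \}$.

1. Prove that $f$ is monotonic.

2. Show that $f^n(\emptyset) \subseteq f^{n+1}(\emptyset)$ and $f^n(\mathbb{N}) \supseteq f^{n+1}(\mathbb{N})$ for each $n \in \mathbb{N}$.

3. Determine the least and greatest fixed points $\mathrm{lfp}(f)$ and $\mathrm{gfp}(f)$.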
1. Identify the domain, codomain, and range of the following functions.

(a)  the  function  that  assigns  to  each  nonnegative  integer  the  least 
prime  number  greater  than  it. 

(b)  the  function  that  assigns  to  each  pair  of  positive  integers  the 
maximum  of  these  two  values. 

2. Let $A = \{1, 2, 3, 4\}$ and $B = \{a, b, c, d\}$, and let $f_1$, $f_2$ and $f_3$ be
functions from $A$ to $B$ with the following graphs:

$\mathrm{graph}(f_1) = \{ (1, d), (2, a), (3, c), (4, c) \}$

$\mathrm{graph}(f_2) = \{ (1, d), (2, c), (3, a), (4, b) \}$

$\mathrm{graph}(f_3) = \{ (1, b), (2, c), (3, a), (4, d) \}$

Indicate  which  of  these  functions  are  one-to-one,  which  are  onto,  and 
which  are  bijections. 

3.  Give  an  example  of  a  function  from  N  to  N  that  is 

(a)  one-to-one  but  not  onto. 

(b)  onto  but  not  one-to-one. 

(c)  one-to-one  and  onto,  but  which  does  not  map  any  value  to  itself. 

(d)  neither  one-to-one  nor  onto. 

4. Find all functions from $X = \{a, b\}$ to $Y = \{1, 2, 3\}$. In each case,
indicate  whether  or  not  the  function  is  one-to-one,  and  whether  or  not 
it  is  onto. 

5. Define the function $f : [0, 1] \to (0, 1)$ by: $f(0) = 1/2$; $f(1/n) =
1/(n+2)$ for all positive integers $n$; and $f(x) = x$ otherwise. Prove
that $f$ is a bijection.
6. Consider the following three functions $f$, $g$ and $h$ from $\{1, 2, 3\}$ to
itself:


Find $f \circ g$, $g \circ f$, $f \circ h$, $h \circ f$, $g \circ h$, $h \circ g$.

7. Find $g \circ f$ and $f \circ g$, where $f(x) = x^2 + 1$ and $g(x) = x - 2$ are functions
from $\mathbb{R}$ to $\mathbb{R}$.

8. Prove that for any function $f : A \to B$, $f = f \circ \mathrm{id}_A$ and $f = \mathrm{id}_B \circ f$,
where $\mathrm{id}_X : X \to X$ is the identity function on $X$, that is, $\mathrm{id}_X(x) = x$
for all $x \in X$.

9. Assuming that $f : A \to B$ and $g : B \to C$, prove or disprove the
following.

(a) If $f$ and $g \circ f$ are both one-to-one, then $g$ must also be one-to-one.

(b) If $g$ and $g \circ f$ are both one-to-one, then $f$ must also be one-to-one.

(c) If $f$ and $g \circ f$ are both onto, then $g$ must also be onto.

(d) If $g$ and $g \circ f$ are both onto, then $f$ must also be onto.

10. Prove that if $A \subseteq B$ and $B$ is countable, then $A$ is countable.

11. In Example 6.15 we saw how to construct a function $f : \mathbb{N} \to \mathbb{Q}^+$
which listed all of the positive rational numbers by zigzagging through
an infinite array of rational numbers. However, we had to disregard the
rational numbers that we came across which were not in lowest terms.
In this exercise we explore an alternative approach which avoids this
complication.

Consider the tree-like diagram in Figure 6.4 which is constructed by
starting with $1/1$ at the top, and from each branching point labelled
$i/j$ drawing a left branch labelled $i/(i+j)$ and a right branch labelled
$(i+j)/j$.

Argue that every positive rational number appears exactly once in this
tree, by arguing that each of the following is true.


(a) Every node is labelled by a rational number in lowest form.

(b) Every rational number appears somewhere in the tree.

(c) No rational number appears twice in the tree.

Thus, to list the rational numbers without repetition we need merely
list the successive rows of the tree.

12. Prove the second part of Theorem 6.18 from page 174, that L as defined
there is the least fixed point of $f$.

13.  (a)  Prove  the  second  part  of  Theorem  6.19  (page  175). 

(b)  Prove  the  four  corollaries  (a)-(d)  to  Theorem  6.19  (page  175). 


Chapter  7 
Relations 


It is a melancholy truth that even great men have their poor relations.

-  Charles  Dickens,  Bleak  House. 

In previous chapters we looked at grouping objects together into sets, as well
as logics to reason about the elements in a set. We also studied functions
$f : A \to B$ mapping elements in one set $A$ to elements in another set $B$.

In this chapter we shall turn our attention towards more general relationships
between elements of sets than simple mappings. Some everyday
examples  of  such  relationships  are  “parenthood”  amongst  the  set  of  people 
(“A  is  a  parent  of  B”)  and  “divisibility”  amongst  the  set  of  integers  (“x 
divides  evenly  into  y”).  More  generally,  relationships  can  exist  between 
elements  of  different  sets,  such  as  the  “enrolment”  relationship  between  the 
sets  of  students  and  courses  (“student  s  takes  course  c”).  Relationships 
may  even  exist  amongst  elements  of  three  or  more  sets,  such  as  the  “grade” 
relationship  between  students,  courses  and  grades  (“student  s  got  a  grade 
of  g  in  course  c”). 


7.1 Basic Definitions

We start by recalling that the truth set of a predicate such as

$S(x, y, z)$ = “student $x$, in course $y$, scored a grade of $z$”

denotes a subset of a Cartesian product, in this case $S \times C \times G$, where $S$,
$C$ and $G$ are the sets of students, courses, and grades, respectively. In this
example, $S(s, c, g)$ is true if, and only if, $s$ is a student who scored a grade
of $g$ in course $c$; and the truth set for this property is

$\mathit{Grades} = \{ (s, c, g) : s \text{ is a student who scored a grade of } g \text{ in course } c \}$.

An $n$-ary relation $R$ is just such a subset of $n$-tuples. In the above example,
the set $\mathit{Grades}$ is a ternary (that is, a 3-ary) relation over $S \times C \times G$:

$\mathit{Grades} \subseteq S \times C \times G$.

The  most  obvious  use  of  n-ary  relations  is  in  representing  databases.  For 
example,  the  above  relation  Grades  might  represent  a  particular  University’s 
database  of  students’  course  grades. 


The Internet Movie Database (IMDb) http://www.imdb.com is a Web site
which contains a massive, and ever-increasing, online database of films and
TV shows, associating with each of these its actors and production crew
personnel (directors, writers, producers, etc.), as well as many other attributes
such as year of release and genre.

For  example,  the  table  in  Figure  7.1  represents  the  database  of  James 
Bond  films,  recording  their  title,  year  of  release,  starring  actor,  and  director. 
This  is  a  fraction  of  the  information,  presented  in  tabular  form,  delivered  by 
IMDb  as  a  result  of  a  search  on  the  term  “James  Bond”.  It  can  be  viewed 
as  a  relation 

$\mathit{BondFilms} \subseteq \mathit{Titles} \times \mathbb{N} \times \mathit{Names} \times \mathit{Names}$

over the sets

Titles = film titles;

$\mathbb{N}$ = natural numbers representing years;

Names = names of people;

containing the 20 records (i.e., 4-tuples):

r01 = (Dr. No, 1962, Sean Connery, Terence Young)
r02 = (Thunderball, 1965, Sean Connery, Terence Young)
  ⋮
r20 = (Skyfall, 2012, Daniel Craig, Sam Mendes).

The  main  use  to  which  such  a  database  is  put  is  for  being  queried.  As  an 
example,  we  may  wish  to  query  the  database  to  find  out  which  James  Bond 
films  star  Roger  Moore.  The  answer  to  this  query  would  be  a  particular  set 
of  records: 

$Q = \{ r \in \mathit{BondFilms} : r \text{ stars Roger Moore} \}$

$\phantom{Q} = \{ r06, r07, r08, r10, r11 \}$.
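Such a query can be sketched in Python by storing the relation as a collection of 4-tuples. The dictionary layout, the variable names and the helper `stars` are assumptions made for this example; the records themselves are those of Figure 7.1.

```python
# Records of the BondFilms relation as 4-tuples (title, year, star, director),
# keyed by the record names used in Figure 7.1.
bond_films = {
    "r01": ("Dr. No", 1962, "Sean Connery", "Terence Young"),
    "r02": ("Thunderball", 1965, "Sean Connery", "Terence Young"),
    "r03": ("You Only Live Twice", 1967, "Sean Connery", "Lewis Gilbert"),
    "r04": ("On Her Majesty's Secret Service", 1969, "George Lazenby", "Peter R. Hunt"),
    "r05": ("Diamonds Are Forever", 1971, "Sean Connery", "Guy Hamilton"),
    "r06": ("The Spy Who Loved Me", 1977, "Roger Moore", "Lewis Gilbert"),
    "r07": ("Moonraker", 1979, "Roger Moore", "Lewis Gilbert"),
    "r08": ("For Your Eyes Only", 1981, "Roger Moore", "John Glen"),
    "r09": ("Never Say Never Again", 1983, "Sean Connery", "Irvin Kershner"),
    "r10": ("Octopussy", 1983, "Roger Moore", "John Glen"),
    "r11": ("A View to a Kill", 1985, "Roger Moore", "John Glen"),
    "r12": ("The Living Daylights", 1987, "Timothy Dalton", "John Glen"),
    "r13": ("Licence to Kill", 1989, "Timothy Dalton", "John Glen"),
    "r14": ("GoldenEye", 1995, "Pierce Brosnan", "Martin Campbell"),
    "r15": ("Tomorrow Never Dies", 1997, "Pierce Brosnan", "Roger Spottiswoode"),
    "r16": ("The World Is Not Enough", 1999, "Pierce Brosnan", "Michael Apted"),
    "r17": ("Die Another Day", 2002, "Pierce Brosnan", "Lee Tamahori"),
    "r18": ("Casino Royale", 2006, "Daniel Craig", "Martin Campbell"),
    "r19": ("Quantum of Solace", 2008, "Daniel Craig", "Marc Forster"),
    "r20": ("Skyfall", 2012, "Daniel Craig", "Sam Mendes"),
}

def stars(record, actor):
    """True if the given 4-tuple lists `actor` as its starring actor."""
    return record[2] == actor

# The query Q: the names of all records whose third component is Roger Moore.
Q = {name for name, record in bond_films.items() if stars(record, "Roger Moore")}
print(sorted(Q))  # ['r06', 'r07', 'r08', 'r10', 'r11']
```

The set comprehension mirrors the set-builder notation used for $Q$: a query selects those tuples of the relation that satisfy a predicate.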


Exercise 7.1 (Solution on page 435)


Express  and  answer  the  following  queries  about  the  above  database  of  James 
Bond  Films. 
       Title                             Year   Star              Director
r01    Dr. No                            1962   Sean Connery      Terence Young
r02    Thunderball                       1965   Sean Connery      Terence Young
r03    You Only Live Twice               1967   Sean Connery      Lewis Gilbert
r04    On Her Majesty’s Secret Service   1969   George Lazenby    Peter R. Hunt
r05    Diamonds Are Forever              1971   Sean Connery      Guy Hamilton
r06    The Spy Who Loved Me              1977   Roger Moore       Lewis Gilbert
r07    Moonraker                         1979   Roger Moore       Lewis Gilbert
r08    For Your Eyes Only                1981   Roger Moore       John Glen
r09    Never Say Never Again             1983   Sean Connery      Irvin Kershner
r10    Octopussy                         1983   Roger Moore       John Glen
r11    A View to a Kill                  1985   Roger Moore       John Glen
r12    The Living Daylights              1987   Timothy Dalton    John Glen
r13    Licence to Kill                   1989   Timothy Dalton    John Glen
r14    GoldenEye                         1995   Pierce Brosnan    Martin Campbell
r15    Tomorrow Never Dies               1997   Pierce Brosnan    Roger Spottiswoode
r16    The World Is Not Enough           1999   Pierce Brosnan    Michael Apted
r17    Die Another Day                   2002   Pierce Brosnan    Lee Tamahori
r18    Casino Royale                     2006   Daniel Craig      Martin Campbell
r19    Quantum of Solace                 2008   Daniel Craig      Marc Forster
r20    Skyfall                           2012   Daniel Craig      Sam Mendes

Figure 7.1: James Bond Films.


1.  Which  Bond  films  were  directed  by  Lewis  Gilbert? 

2.  Which  Bond  Films  were  released  in  the  1970s? 


7.2 Binary Relations

Binary (that is, 2-ary) relations are the most common types of relations,
and are of particular importance. Concepts such as

• order (“element a comes before element b”),

• equivalence (“element a is the same as element b”), and

• function (“input a results in output b”)

are all examples of binary relations, relating one thing a to another thing b.
They are often written in infix style, so that we would write $a R b$ rather
than $(a, b) \in R$.

A binary relation $R \subseteq A \times B$ is thus just a set of ordered pairs, and is
said to be a relation from the set A to the set B. The sets A and B are
referred to as the source and target, respectively, of R.

A binary relation $R \subseteq A \times A$ from a set A to itself is said to be a relation
on A. In this case, the relation is said to be homogeneous, whereas a
relation $R \subseteq A \times B$ with $A \neq B$ is said to be heterogeneous.


As an example of a binary relation on the natural numbers $\mathbb{N}$ we can take
the usual less-than-or-equal-to relation ${\leq} \subseteq \mathbb{N} \times \mathbb{N}$:

$$\leq \;=\; \{ (x, y) : x \leq y \}
      \;=\; \{ (0,0), (0,1), (1,1), (0,2), (1,2), (2,2), \ldots \}.$$

As an example of a binary relation from the set H of humans to the
natural numbers $\mathbb{N}$ we can take the relation $R \subseteq H \times \mathbb{N}$ given by:

$$R = \{ (x, n) \in H \times \mathbb{N} : x \text{ has } n \text{ children} \}.$$

As an example of a binary relation from the set C of cities to the set N
of countries (nations) we can take the relation $R \subseteq C \times N$ given by:

$$R = \{ (c, n) \in C \times N : c \text{ is located in } n \}.$$


Example 7.3

Joel likes mint ice cream and coffee ice cream; Felix likes vanilla ice cream
and cherry ice cream; Oskar likes vanilla ice cream and chocolate ice cream;
and Amanda likes chocolate ice cream and mint ice cream. These properties
can be related by the binary relation

$$\mathit{Likes} \subseteq \mathit{Children} \times \mathit{Flavours}$$

where

Children = {Joel, Felix, Oskar, Amanda} and
Flavours = {Vanilla, Chocolate, Coffee, Cherry, Mint}

consisting of the following ordered pairs:

Likes = { (Joel, Mint), (Joel, Coffee),
          (Felix, Vanilla), (Felix, Cherry),
          (Oskar, Vanilla), (Oskar, Chocolate),
          (Amanda, Chocolate), (Amanda, Mint) }.

Thus,

$$\mathit{Likes} = \{ (c, f) \in \mathit{Children} \times \mathit{Flavours} : \text{child } c \text{ likes ice cream flavour } f \}.$$

Put differently, this relation is the truth set of the predicate L defined by
$L(c, f)$ = child c likes ice cream flavour f.
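As a minimal sketch, a finite binary relation can be stored in Python as a set of tuples. The names children, flavours, likes and cartesian are illustrative; the predicate L is here simply defined by membership in the listed pairs, which is an assumption of the sketch rather than an independent source of truth.

```python
children = {"Joel", "Felix", "Oskar", "Amanda"}
flavours = {"Vanilla", "Chocolate", "Coffee", "Cherry", "Mint"}

# The relation Likes, listed as ordered pairs (child, flavour).
likes = {
    ("Joel", "Mint"), ("Joel", "Coffee"),
    ("Felix", "Vanilla"), ("Felix", "Cherry"),
    ("Oskar", "Vanilla"), ("Oskar", "Chocolate"),
    ("Amanda", "Chocolate"), ("Amanda", "Mint"),
}

def L(c, f):
    """Predicate: child c likes ice cream flavour f."""
    return (c, f) in likes

# The Cartesian product Children x Flavours.
cartesian = {(c, f) for c in children for f in flavours}

# Likes is a subset of Children x Flavours ...
is_subset = likes <= cartesian

# ... and coincides with the truth set of the predicate L.
truth_set = {(c, f) for (c, f) in cartesian if L(c, f)}
```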


Exercise 7.3 (Solution on page 435)

Referring to the database of James Bond films in Example 7.1, give the
binary relation $\mathit{StarsIn} \subseteq \mathit{Names} \times \mathit{Titles}$ defined by

$$\mathit{StarsIn} = \{ (x, y) : x \text{ stars as James Bond in } y \}.$$


Binary relations can be visualised pictorially by drawing arrows connecting
the related objects.

• A heterogeneous relation $R \subseteq A \times B$ from A to B would most naturally
be depicted by drawing the two sets A and B side-by-side, and drawing
an arrow from each element $a \in A$ in the first set to each of those
elements $b \in B$ to which it is related; i.e., such that $(a, b) \in R$.

• A homogeneous relation $R \subseteq A \times A$ on A, on the other hand, might
more naturally be depicted by simply laying out the elements of A
in some natural fashion, and drawing an arrow from $a \in A$ to $b \in A$
whenever $(a, b) \in R$.


Example 7.4

The relation Likes of Example 7.3 is pictured as follows:

[Figure: arrow diagram of the relation Likes, from Children to Flavours]

We have an arrow from a child $c \in \mathit{Children}$ to a flavour $f \in \mathit{Flavours}$ whenever
$(c, f) \in \mathit{Likes}$.


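Example 7.5

The subset relation $\subseteq$ on the powerset of $\{a, b\}$:

$$\mathcal{P}(\{a, b\}) = \{\, \emptyset, \{a\}, \{b\}, \{a, b\} \,\}$$

is pictured as follows:

[Figure: arrow diagram of the subset relation on the powerset of {a, b}]

We have an arrow from one set A to another set B whenever $A \subseteq B$.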


Exercise 7.5 (Solution on page 435)

Referring to the database of James Bond films in Example 7.1, let

$$\mathit{BondActors} \subseteq \mathit{Names}$$

be the set of six actors who have played the role of James Bond, and define
the two binary relations Before and FirstBefore on BondActors as
follows:

$$\mathit{Before} = \{ (x, y) : x \text{ stars as James Bond in an earlier film than one in which } y \text{ stars as James Bond} \};$$

$$\mathit{FirstBefore} = \{ (x, y) : x \text{ starred as James Bond before } y \text{ did} \}.$$

Present these relations pictorially, and list out their elements.

(Be careful with this exercise. The way that the binary relation Before
is defined allows each of two actors to appear before the other, and for one
actor to appear before himself!)


Kinship relations are prime examples of binary relations. We all have
an intuitive grasp of these and we can name a wide range of relationships,
e.g. father, mother, sibling, great uncle. The English language is not even
particularly rich in this respect. In Swedish, for example, you don’t just refer
to your aunt or your uncle, but more specifically to your farbror (father’s
brother), your morbror (mother’s brother), your faster (father’s sister), or
your moster (mother’s sister).


Example 7.6

The Duck family consists of the parents Hortense and Quackmore Duck,
and their two children Della and Donald. Hortense has a brother Scrooge,
and Della has three sons: Huey, Louis and Dewey. Let us consider the set
of these eight Ducks:

Ducks = { Quackmore, Hortense, Scrooge,
          Donald, Della, Huey, Louis, Dewey }.

There are a variety of kinship relations defined over $\mathit{Ducks} \times \mathit{Ducks}$, such
as the following:

Father = { (Quackmore, Donald), (Quackmore, Della) }.

Mother = { (Hortense, Donald), (Hortense, Della),
           (Della, Huey), (Della, Louis), (Della, Dewey) }.

Parent = { (Quackmore, Donald), (Quackmore, Della),
           (Hortense, Donald), (Hortense, Della),
           (Della, Huey), (Della, Louis), (Della, Dewey) }.

Uncle = { (Scrooge, Donald), (Scrooge, Della),
          (Donald, Huey), (Donald, Louis), (Donald, Dewey) }.


Exercise 7.6 (Solution on page 436)


Define  the  kinship  relations  Child,  Brother,  Sister  and  Sibling  on  the  Duck 
family  of  Example  7.6,  and  present  the  Child  relation  pictorially. 


7.2.1  Functions  as  Binary  Relations 

We have defined a function $f : A \to B$ to be an assignment of exactly one
element of B to each element of A, and noted in Theorem 6.4 that such a
function is completely determined by its graph:

$$\mathrm{graph}(f) = \{ (a, b) \in A \times B : b = f(a) \}.$$

The graph of the function f is a binary relation from A to B satisfying the
following special property: every element $a \in A$ is related to exactly one
element $b \in B$.

Conversely, any binary relation $R \subseteq A \times B$ which satisfies this property
defines a function $f_R : A \to B$.

Theorem 7.6

A binary relation $R \subseteq A \times B$ is the graph of a function from A to B if, and
only if,

$$\forall a \in A \;\, \exists! b \in B \;\, \big( (a, b) \in R \big) \qquad (*)$$

Proof: If the relation $R \subseteq A \times B$ satisfies the property (*), then we can
define a function $f_R : A \to B$ by mapping each $a \in A$ to the unique $b \in B$
such that $(a, b) \in R$. Clearly, $\mathrm{graph}(f_R) = R$, as given any $(a, b) \in A \times B$,

$$\begin{aligned}
(a, b) \in \mathrm{graph}(f_R) &\iff f_R(a) = b && \text{(by definition of } \mathrm{graph}(f_R)\text{)} \\
&\iff (a, b) \in R && \text{(by definition of } f_R\text{)}.
\end{aligned}$$

Conversely, if $R = \mathrm{graph}(f)$ for some function $f : A \to B$, then R must
satisfy (*), since f assigns to each $a \in A$ exactly one element $b = f(a)$ of B. □
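For finite sets, property (*) can be checked mechanically. The following is a minimal sketch under that finiteness assumption; the function names is_function_graph and function_from, and the sample sets, are hypothetical and not part of the text.

```python
def is_function_graph(R, A, B):
    """Check property (*): every a in A is related to exactly one b in B."""
    if not all(a in A and b in B for (a, b) in R):
        return False  # R is not even a subset of A x B
    # R is a set, so distinct pairs with the same first component
    # necessarily have distinct second components.
    return all(sum(1 for (x, _) in R if x == a) == 1 for a in A)

def function_from(R):
    """Build f_R as a dict, assuming R has already passed the check above."""
    return dict(R)

A = {1, 2, 3}
B = {"x", "y"}
good = {(1, "x"), (2, "x"), (3, "y")}
not_total = {(1, "x"), (2, "y")}      # 3 is related to nothing
not_single = good | {(1, "y")}        # 1 is related to two elements

f = function_from(good)
```

Reading the pairs of f back out with `set(f.items())` recovers the relation, mirroring the claim that the graph of $f_R$ is R.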


7.3 Operations on Binary Relations

We have defined binary relations as certain sets; specifically, a binary relation
from A to B is a subset of $A \times B$. With this view in mind, there are various
operations which we can apply to binary relations to extract information
from them, or to build further binary relations, typical of the sort employed
by database queries.

7.3.1  Boolean  Operations 

As  binary  relations  are  sets  (of  pairs),  the  usual  set  operations  can  be  applied 
to  these,  often  quite  usefully.  In  the  above  Duck  family  Example  7.6,  for 
instance,  the  Parent  relation  is  defined  simply  as  the  union  of  the  Father 
and  Mother  relations: 

$$\mathit{Parent} = \mathit{Father} \cup \mathit{Mother}.$$

This is intuitively clear, as x is a parent of y if, and only if, either x is the
father of y, or x is the mother of y:

$(x, y) \in \mathit{Parent}$ if, and only if, $(x, y) \in \mathit{Father}$ or $(x, y) \in \mathit{Mother}$.

We can also express the Father relation in terms of the Parent and
Mother relations, noting that a father is someone who is a parent but not a
mother:

$$\mathit{Father} = \mathit{Parent} \setminus \mathit{Mother}.$$

Note that in order to apply set operations to binary relations, the relations
being operated on must be defined over the same sets (in this case,
$\mathit{Ducks} \times \mathit{Ducks}$). It would not make much sense, for example, to take
the union $\mathit{Father} \cup \mathit{Before}$ of the relation $\mathit{Father} \subseteq \mathit{Ducks} \times \mathit{Ducks}$ from
Example 7.6 and the relation $\mathit{Before} \subseteq \mathit{Names} \times \mathit{Names}$ from Exercise 7.5.
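With relations stored as Python sets of pairs (an encoding assumed here for illustration), the Boolean operations above are just the built-in set operators. The sketch uses the Father and Mother relations of the Duck family from Example 7.6.

```python
father = {("Quackmore", "Donald"), ("Quackmore", "Della")}
mother = {("Hortense", "Donald"), ("Hortense", "Della"),
          ("Della", "Huey"), ("Della", "Louis"), ("Della", "Dewey")}

parent = father | mother          # Parent = Father ∪ Mother
father_again = parent - mother    # Father = Parent \ Mother
both = father & mother            # pairs in both relations: none here
```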


Exercise 7.7 (Solution on page 437)

Let $R_1$, $R_2$ and $R_3$ represent the less-than relation $<$, the equality relation
$=$, and the less-than-or-equal-to relation $\leq$, respectively, all on the set $\mathbb{N}$
of natural numbers:

$$R_1 = \{ (x, y) \in \mathbb{N}^2 : x < y \};$$

$$R_2 = \{ (x, y) \in \mathbb{N}^2 : x = y \};$$

$$R_3 = \{ (x, y) \in \mathbb{N}^2 : x \leq y \}.$$

What are the following relations?

1. $R_1 \cup R_2$

2. $R_3 \cap R_2$

3. $R_3 \setminus R_1$


7.3.2  Inverting  Relations 

Given a binary relation, an obvious and natural thing to do is to turn it
around, or invert it, and consider the converse relation. For example, the
opposite, or inverse, of the less-than-or-equal-to relation $\leq$ is the
greater-than-or-equal-to relation $\geq$ (as $x \leq y$ if, and only if, $y \geq x$); and the
opposite, or inverse, of the Parent relation is the Child relation (as x is a
parent of y if, and only if, y is a child of x).

Given a binary relation $R \subseteq A \times B$ from a set A to a set B, the inverse
relation $R^{-1} \subseteq B \times A$ from B to A is defined as

$$R^{-1} = \{ (b, a) : (a, b) \in R \}.$$

If we consider the pictorial representation of the relation R, we can derive
the pictorial representation of $R^{-1}$ simply by reversing the direction of all
of the arrows, thus replacing each arrow from a to b where $(a, b) \in R$ by an
arrow from b to a.

The inverse of the relation $\mathit{Likes} \subseteq \mathit{Children} \times \mathit{Flavours}$ from Example 7.3 is
the relation $\mathit{Likes}^{-1} \subseteq \mathit{Flavours} \times \mathit{Children}$ of “is liked by”:

[Figure: arrow diagrams of the relations Likes and Likes^{-1}]

For example, $(\text{Joel}, \text{Mint}) \in \mathit{Likes}$ indicates that Joel likes mint ice cream,
while $(\text{Mint}, \text{Joel}) \in \mathit{Likes}^{-1}$ indicates that mint ice cream is liked by Joel.
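Inverting a finite relation amounts to swapping the components of every pair. A small sketch, with the helper name inverse chosen for illustration and the pairs taken from the Likes relation of Example 7.3:

```python
def inverse(R):
    """Return R^{-1} = {(b, a) : (a, b) in R}."""
    return {(b, a) for (a, b) in R}

likes = {
    ("Joel", "Mint"), ("Joel", "Coffee"),
    ("Felix", "Vanilla"), ("Felix", "Cherry"),
    ("Oskar", "Vanilla"), ("Oskar", "Chocolate"),
    ("Amanda", "Chocolate"), ("Amanda", "Mint"),
}

liked_by = inverse(likes)
```

Applying inverse twice returns the original relation, which matches the picture of reversing every arrow and then reversing it back.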


Exercise 7.8 (Solution on page 437)


What is $\mathit{Sibling}^{-1}$, the inverse of the Sibling relation?


7.3.3  Composing  Relations 

As well as turning relations around, another natural operation is to combine,
or compose, two relations by following one with another. Given relations
$R \subseteq A \times B$ from A to B and $S \subseteq B \times C$ from B to C, the composition of
S and R is the relation $S \circ R \subseteq A \times C$ from A to C defined as

$$S \circ R = \{ (a, c) \in A \times C : \exists b \in B \text{ such that } (a, b) \in R \text{ and } (b, c) \in S \}.$$

If we consider the pictorial representation of the relations R and S, we can
derive the pictorial representation of $S \circ R$ simply by following an R-arrow
by an S-arrow, as in the following example:

[Figure: arrow diagrams illustrating the composition $S \circ R$ from A to C]

Note that the target of the relation R must be the same as the source of the
relation S in order to form the composition. Also note carefully the order of
the relations: the composition $S \circ R$ of the relations S and R first “applies”
the relation R to its source before “applying” the relation S to the result. In
this sense, the definition coincides with the composition of functions given
in Definition 6.8.


Example 7.8

A grandfather is a father of a parent, and we can use this characterisation
to define the Grandfather relation:

$$\mathit{Grandfather} = \mathit{Parent} \circ \mathit{Father}.$$

The order in which we write the two relations which are being composed is
important. For example, a grandfather is a father of a parent, which is not
the same thing as a parent of a father:

$$\mathit{Father} \circ \mathit{Parent} \neq \mathit{Parent} \circ \mathit{Father}.$$
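The order of composition can be checked on the Duck family of Example 7.6. This is a sketch for finite relations stored as sets of pairs; the helper name compose is an assumption, and its argument order follows the convention above, so compose(S, R) applies R first and then S.

```python
def compose(S, R):
    """Return S o R = {(a, c) : some b has (a, b) in R and (b, c) in S}."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

father = {("Quackmore", "Donald"), ("Quackmore", "Della")}
mother = {("Hortense", "Donald"), ("Hortense", "Della"),
          ("Della", "Huey"), ("Della", "Louis"), ("Della", "Dewey")}
parent = father | mother

# Grandfather = Parent o Father: a father of a parent.
grandfather = compose(parent, father)

# Father o Parent: a parent of a father, which is a different relation.
parent_of_father = compose(father, parent)
```

On this small family the second composition is empty, because nobody in the set Ducks is a parent of Quackmore, the only father.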


Exercise 7.9 (Solution on page 437)

Define  the  relations  Uncle  and  Nephew  in  terms  of  simpler  relations,  and 
derive  these  relations  for  the  Duck  family  of  Example  7.6. 


7.3.4  The  Domain  and  Range  of  a  Relation 

Given the relation $R \subseteq A \times B$ from A to B,

• the domain of R is the set

$$\mathrm{domain}(R) = \{ a \in A : \exists b \in B \text{ such that } (a, b) \in R \};$$

• the range of R is the set

$$\mathrm{range}(R) = \{ b \in B : \exists a \in A \text{ such that } (a, b) \in R \}.$$

That  is  to  say,  the  domain  of  a  relation  consists  of  all  elements  of  the  source 
A  of  the  relation  which  are  related  to  something  in  the  target  B,  and  the 
range  of  a  relation  consists  of  all  elements  of  the  target  B  of  the  relation 
which  are  related  to  something  in  the  source  A. 

Example 7.9

Consider the following relations on humans H:

$$\mathit{Parent} = \{ (x, y) : x \text{ is a parent of } y \}$$

$$\mathit{Brother} = \{ (x, y) : x \text{ is a brother of } y \}$$

Then

domain(Parent) = the set of parents (not all of H);
range(Parent) = the set of children (all of H);
domain(Brother) = the set of brothers (males with siblings);
range(Brother) = the set of humans with a brother.
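For a finite relation the domain and range are simply the sets of first and second components. The sketch uses the small Parent relation of the Duck family (Example 7.6) rather than all of H; the helper names domain and range_of are illustrative.

```python
def domain(R):
    """Elements of the source that are related to something."""
    return {a for (a, _) in R}

def range_of(R):
    """Elements of the target that something is related to."""
    return {b for (_, b) in R}

parent = {("Quackmore", "Donald"), ("Quackmore", "Della"),
          ("Hortense", "Donald"), ("Hortense", "Della"),
          ("Della", "Huey"), ("Della", "Louis"), ("Della", "Dewey")}

parents = domain(parent)
children = range_of(parent)
```

Unlike the relation on all humans, the range here is not the whole set Ducks, since the parents of Quackmore, Hortense and Scrooge are not in the set.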


Exercise 7.10 (Solution on page 438)

Prove that if $R \subseteq A \times B$ is the graph of a function $f : A \to B$, then
$\mathrm{domain}(R) = A$ (i.e., the domain of f) and $\mathrm{range}(R) = \mathrm{range}(f)$.


7.4 Properties of Binary Relations

There  are  various  properties  that  a  binary  relation  on  a  set  A  may  or  may 
not  satisfy.  Of  particular  interest  are  the  properties  of  reflexivity,  symmetry 
and  transitivity,  all  of  which  we  shall  explore  in  this  section. 

7.4.1  Reflexive  and  Irreflexive  Relations 

The difference between the less-than relation $<$ and the less-than-or-equal-to
relation $\leq$ on numbers is that any number is less-than-or-equal-to itself
(since it is equal to itself), but no number is less-than itself. For example,
$2 \leq 2$ is true but $2 < 2$ is not true. This motivates our first property.

Definition 7.10

A relation R on a set A is reflexive if, and only if, every element of A is
related to itself by R:

$$\forall x \in A \; (x R x).$$

The relation is irreflexive if, and only if, no element of A is related to itself:

$$\forall x \in A \; \neg (x R x).$$


Thus, for example, the less-than-or-equal-to relation $\leq$ is reflexive,
while the less-than relation $<$ is irreflexive. Note that irreflexive is not
the same as non-reflexive: it is possible for a binary relation to relate some
but not all elements to themselves, thus making the relation neither reflexive
nor irreflexive.
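The three possibilities (reflexive, irreflexive, neither) can be seen on a small finite set. This is a sketch only: the relations are restricted to the set {0, 1, 2, 3} so that they can be listed, and the names is_reflexive, is_irreflexive and mixed are illustrative.

```python
def is_reflexive(R, A):
    """Every element of A is related to itself."""
    return all((x, x) in R for x in A)

def is_irreflexive(R, A):
    """No element of A is related to itself."""
    return all((x, x) not in R for x in A)

A = set(range(4))
leq = {(x, y) for x in A for y in A if x <= y}
lt = {(x, y) for x in A for y in A if x < y}

# Relates 0 to itself but not 1, 2 or 3: neither reflexive nor irreflexive.
mixed = {(0, 0), (0, 1)}
```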

Exercise 7.11 (Solution on page 438)

Is the relation Before from Exercise 7.5 reflexive, irreflexive, or neither?
What about the relation FirstBefore?


7.4.2  Symmetric  and  Antisymmetric  Relations 

Equality between objects suggests - amongst other things - a certain symmetry
between the objects, which is captured by the next property of interest.

Definition 7.11

A relation R on a set A is symmetric if, and only if, y is related to x
whenever x is related to y:

$$\forall x, y \in A \; (x R y \Rightarrow y R x).$$

The relation is antisymmetric if, and only if, y is never related to x
whenever x is related to y, except possibly for when x = y:

$$\forall x, y \in A \; \big( (x R y \wedge y R x) \Rightarrow x = y \big).$$

Thus, for example, the relations $\leq$ and $<$ are both antisymmetric, while
the relation $=$ is symmetric (as well as antisymmetric).
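Both properties can be tested directly from the definition on a finite relation. A sketch, again restricting the numeric relations to {0, 1, 2, 3}; the function names are illustrative.

```python
def is_symmetric(R):
    """Whenever (x, y) is in R, so is (y, x)."""
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    """(x, y) and (y, x) are both in R only when x == y."""
    return all(x == y for (x, y) in R if (y, x) in R)

A = set(range(4))
eq = {(x, x) for x in A}
leq = {(x, y) for x in A for y in A if x <= y}
lt = {(x, y) for x in A for y in A if x < y}
```

Equality passes both tests, which shows that symmetric and antisymmetric are not opposites.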


Exercise 7.12 (Solution on page 438)


Is the relation Before from Exercise 7.5 symmetric, antisymmetric, or neither?
What about the relation FirstBefore?


7.4.3  Transitive  Relations 

If  one  number  is  less  than  a  second  number  which  is  itself  less  than  a  third 
number,  then  clearly  the  first  number  will  also  be  less  than  the  third  number. 
This  property  of  the  less-than  relation  is  embodied  in  the  final  property  of 
interest. 

Definition 7.12

A relation R on a set A is transitive if, and only if, x is related to z
whenever x is related to some y which is related to z:

$$\forall x, y, z \in A \; \big( (x R y \wedge y R z) \Rightarrow x R z \big).$$

Thus, for example, the relations $\leq$ and $<$ are both transitive.

Exercise 7.13 (Solution on page 438)

Is the relation Before from Exercise 7.5 transitive? What about the relation
FirstBefore?


Example 7.13

Consider the sibling (brother or sister) relationship over people.

1. This is not reflexive, as you would not consider someone to be their
own sibling. It is in fact irreflexive.

2. It is symmetric as anyone is obviously a sibling to each of their siblings.
Clearly it is not antisymmetric.

3. Finally, it is not transitive, as this (together with symmetry) would
imply that any person who has a sibling must be a sibling of themselves.
Also, if we allow half-siblings, one person may be a sibling to a second
person due to sharing a common father whilst having different mothers;
and the second person may be a sibling to yet a third person due to
sharing a common mother whilst having different fathers. In this scenario,
the first and third children would not be siblings, as they do not share
a common parent.
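The failure of transitivity can be checked on a tiny, made-up family. The three names are hypothetical: Ann and Bob are assumed to share a father, and Bob and Cat to share a mother, as in the half-sibling scenario described above.

```python
def is_transitive(R):
    """Whenever (x, y) and (y, z) are in R, so is (x, z)."""
    return all((x, z) in R
               for (x, y) in R for (y2, z) in R if y == y2)

# Symmetric sibling relation for the hypothetical half-sibling family.
sibling = {("Ann", "Bob"), ("Bob", "Ann"),
           ("Bob", "Cat"), ("Cat", "Bob")}

# For comparison: the less-than relation on {0, ..., 4}.
lt = {(x, y) for x in range(5) for y in range(5) if x < y}

missing = {(x, z)
           for (x, y) in sibling for (y2, z) in sibling
           if y == y2 and (x, z) not in sibling}
```

The set missing lists the pairs that transitivity would force into the relation; it contains both the reflexive pairs and the pairs relating Ann and Cat.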


Exercise 7.14 (Solution on page 439)


Consider the relations is-an-ancestor-of and is-married-to defined over
people. Indicate whether these are reflexive, irreflexive, symmetric,
antisymmetric, and/or transitive. Justify your answers.


7.4.4 Ordering Relations

Various common binary relations arrange the elements of their domain into
some specific ordering. For example, the less-than-or-equal-to relation $\leq$
orders the natural numbers into an increasing sequence: $0 \leq 1 \leq 2 \leq 3 \leq \cdots$.
Note that this ordering is total in the sense that any two numbers a and b
are related in one way or the other: either $a \leq b$ or $b \leq a$.

Whether  or  not  a  particular  binary  relation  defined  on  a  set  orders  the 
elements  of  that  set  depends  on  whether  or  not  it  satisfies  certain  of  the 
properties  defined  above.  Naturally,  a  less-than-or-equal-to  relation  should 
be: 




•  reflexive  -  any  element  should  be  less-than-or-equal-to  itself; 

•  antisymmetric  -  if  a  is  less-than-or-equal-to  b  and  b  is  also  less-than- 
or-equal-to  a,  then  a  and  b  should  be  equal. 

•  transitive  -  if  a  is  less-than-or-equal-to  b  and  b  is  less-than-or-equal- 
to  c,  then  a  should  be  less-than-or-equal-to  c. 

In fact, these three properties taken together indicate that a relation is an
ordering relation, as defined below.

Definition 7.15

A binary relation R on a set A is a partial order if, and only if, it is reflexive,
antisymmetric, and transitive. It is a total order if, and only if, it is a
partial order in which any two elements are related in one way or the other:

$$\forall x, y \in A \; (x R y \vee y R x).$$


Example 7.15

• The equality relation $=$ on integers is a partial order, but it is not a
total order.

• The less-than-or-equal-to relation $\leq$ on integers is a total order. However,
the less-than relation $<$ on integers is not a (total or partial)
order, as it is not reflexive.

• The subset relation $\subseteq$ on sets is a partial order but not a total order;
for example, $\{1\} \not\subseteq \{2\}$ and $\{2\} \not\subseteq \{1\}$.
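The three bullet points can be confirmed on finite fragments of these relations. This is a sketch under the assumption that the integers are cut down to {0, 1, 2, 3} and the sets to the powerset of {1, 2}; the function and variable names are illustrative.

```python
from itertools import combinations

def is_partial_order(R, A):
    """Reflexive, antisymmetric and transitive on A."""
    reflexive = all((x, x) in R for x in A)
    antisymmetric = all(x == y for (x, y) in R if (y, x) in R)
    transitive = all((x, z) in R
                     for (x, y) in R for (y2, z) in R if y == y2)
    return reflexive and antisymmetric and transitive

def is_total_order(R, A):
    """A partial order in which any two elements are comparable."""
    return is_partial_order(R, A) and all(
        (x, y) in R or (y, x) in R for x in A for y in A)

nums = set(range(4))
leq = {(x, y) for x in nums for y in nums if x <= y}
lt = {(x, y) for x in nums for y in nums if x < y}
eq = {(x, x) for x in nums}

# Powerset of {1, 2}; frozensets are used so that sets can appear in pairs.
base = [1, 2]
subsets = {frozenset(c) for r in range(len(base) + 1)
           for c in combinations(base, r)}
subset_rel = {(X, Y) for X in subsets for Y in subsets if X <= Y}
```

On frozensets the operator `<=` is the subset test, so subset_rel is exactly the subset relation restricted to these four sets.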


7.4.5  Equivalence  Relations 

A  binary  relation  on  a  set  may  reflect  a  notion  of  sameness  between  elements 
of  that  set,  defining  when  we  might  want  to  consider  two  elements  of  the 
set  to  be  indistinguishable  -  that  they  are  in  some  sense  equivalent. 

As  with  orderings,  whether  or  not  a  particular  relation  over  a  set  defines 
an  equivalence  between  elements  of  that  set  depends  on  whether  or  not  it 
satisfies  certain  of  the  properties  defined  above.  Naturally,  such  a  relation 
should  be: 

•  reflexive  -  any  element  should  be  the  same  as  itself; 

•  symmetric  -  if  a  is  the  same  as  b  then  b  should  be  the  same  as  a; 

•  transitive  -  if  a  is  the  same  as  b  and  b  is  the  same  as  c,  then  a  should 
be the same as c.

These three properties suffice to define a notion of sameness.

Definition 7.16

A  binary  relation  R  on  a  set  is  an  equivalence  relation  if,  and  only  if,  it 
is  reflexive,  symmetric,  and  transitive. 


Example 7.17

• The equality relation $=$ on integers is an equivalence relation.

• The less-than-or-equal-to relation $\leq$ on integers is not an equivalence
relation, as it is not symmetric. Furthermore, the less-than relation
$<$ on integers fails to be an equivalence relation for this same reason,
as well as for not being reflexive.

• The subset relation $\subseteq$ on sets is not an equivalence relation, as it is
not symmetric.


Example 7.18

Consider  splitting  up  a  set  A  of  people  into  twelve  groups  depending  on  the 
month  of  their  birthday;  for  example,  one  of  the  groups  might  consist  of 
all  those  people  in  A  whose  birthday  is  in  September.  (There  may  actually 
be  fewer  than  twelve  groups,  if  there  are  months  in  which  no  one  in  A  was 
born.)  This  naturally  defines  an  equivalence  relation  R  on  A  in  which  two 
people  are  related  if,  and  only  if,  their  birthdays  are  in  the  same  month: 

R  =  {(x,y)  :  x  and  y  have  birthdays  in  the  same  month  }. 

Clearly  this  relation  is  reflexive,  symmetric  and  transitive. 


Exercise 7.18 (Solution on page 439)

Which of the following binary relations on $\mathbb{N}$ are partial orders? Which are
total orders? Which are equivalences? Explain your answers.

1. The identity relation $I = \{ (n, n) : n \in \mathbb{N} \}$.

2. The universal relation $U = \{ (m, n) : m, n \in \mathbb{N} \}$.

3. The parity relation $P = \{ (m, n) : m \equiv n \pmod 2 \}$.


Exercise 7.19 (Solution on page 439)


Consider a set S of students who are each taking some number of courses
chosen from a set C of courses. Define the following binary relations on S:

$$R_1 = \{ (s_1, s_2) : s_1 \text{ and } s_2 \text{ take all the same courses} \}.$$

$$R_2 = \{ (s_1, s_2) : s_1 \text{ and } s_2 \text{ take some course together} \}.$$

Is either of these an equivalence relation? Justify your answer.


7.4.6  Equivalence  Classes  and  Partitions 

Consider  the  equivalence  relation  R  from  Example  7.18  defined  over  some 
set  A  of  people: 

R  =  {(x,y)  :  x  and  y  have  birthdays  in  the  same  month  }. 

We  based  this  equivalence  relation  on  a  partitioning  of  the  set  A  into  disjoint 
sets.  This  idea  is  formalised  in  the  following. 

Definition 7.20

A partition of a set A is a collection $\{ A_i : i \in I \}$ of disjoint non-empty
subsets of A which together contain all of A. That is:

1. $A_i \cap A_j = \emptyset$ whenever $i \neq j$; and

2. $\bigcup_{i \in I} A_i = A$.

The subsets $A_i$ are called the blocks of the partition. We say that one
partition is a refinement of a second partition if, and only if, every block of
the first is a subset of some block of the second.
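For finite sets, the conditions of this definition translate into a short check. A sketch, with partitions represented as lists of Python sets (an assumed encoding) and hypothetical sample data:

```python
def is_partition(blocks, A):
    """Blocks are non-empty, pairwise disjoint, and together cover A."""
    blocks = [set(b) for b in blocks]
    non_empty = all(blocks)
    disjoint = all(b1.isdisjoint(b2)
                   for i, b1 in enumerate(blocks)
                   for b2 in blocks[i + 1:])
    covers = set().union(*blocks) == set(A)
    return non_empty and disjoint and covers

def is_refinement(fine, coarse):
    """Every block of fine is a subset of some block of coarse."""
    return all(any(set(f) <= set(c) for c in coarse) for f in fine)

A = {1, 2, 3, 4, 5, 6}
coarse = [{1, 2, 3}, {4, 5, 6}]
fine = [{1}, {2, 3}, {4, 5, 6}]
overlapping = [{1, 2}, {2, 3}, {4, 5, 6}]   # 2 appears in two blocks
```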


Example 7.20

We  can  refine  the  relation  R  from  Example  7.18  by  splitting  the  people  of  A 
not  just  according  to  the  month  of  their  birth,  but  according  to  sex  as  well, 
thus  creating  (up  to)  24  groups;  for  example,  one  of  the  groups  might  consist 
of  all  females  in  A  whose  birthday  is  in  September.  This  new  partition  of  A 
is  clearly  a  refinement  of  the  original  coarser  partition  defined  only  by  birth 
month. 


Exercise 7.21 (Solution on page 439)

What is the finest partition of a set A, in the sense that it cannot be refined
into a different partition? What is the coarsest (i.e., least fine) partition?


Any partition of a set A naturally defines an equivalence relation, in
just the way the partition of Example 7.18 gave rise to the equivalence
relation R; two elements of A will be deemed equivalent if, and only if, they
appear in the same block of the partition. Just as clearly, any equivalence
relation partitions the elements over which it is defined into disjoint non-empty
subsets, called equivalence classes.

Definition 7.21

Given an equivalence relation R on a set A, the equivalence class of an
element a of A with respect to R, denoted $[a]_R$, is the set of elements of A
which are related to a by R:

$$[a]_R = \{ x \in A : a R x \}.$$


Theorem 7.22

The collection of equivalence classes $\{ [a]_R : a \in A \}$ of an equivalence
relation R is a partition of A.


Proof: To prove this we need to show the following:

1. Each $[a]_R$ is non-empty.

This is true since $a \in [a]_R$, by reflexivity.

2. The union of the equivalence classes is A.

This is true since each $a \in A$ is in the equivalence class $[a]_R$.

3. The equivalence classes are disjoint; in other words, two non-disjoint
equivalence classes must be equal.

To see this, let us assume that $[a]_R$ and $[b]_R$ are not disjoint, that
they contain a common element x; that is, aRx and bRx, which by
symmetry means also that xRa, and thus by transitivity that bRa.
By symmetry again, we also have aRb. Then

$$\begin{aligned}
y \in [a]_R &\iff a R y \\
&\iff b R y && \text{(by transitivity, using } b R a \text{ in one direction and } a R b \text{ in the other)} \\
&\iff y \in [b]_R.
\end{aligned}$$

Thus we must have that $[a]_R = [b]_R$. □
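The theorem can be observed on a small instance of the birthday-month relation of Example 7.18. The people and their birth months are invented for illustration, and the names birth_month, equivalence_class and classes are hypothetical.

```python
# Hypothetical people and the month of their birthday.
birth_month = {"Ada": "Sep", "Ben": "Sep", "Cy": "Jan",
               "Di": "May", "Eve": "Jan"}
A = set(birth_month)

# R relates two people exactly when their birthdays are in the same month.
R = {(x, y) for x in A for y in A if birth_month[x] == birth_month[y]}

def equivalence_class(a, R, A):
    """[a]_R = {x in A : a R x}, as a frozenset so classes can be collected."""
    return frozenset(x for x in A if (a, x) in R)

# Collecting the class of every element; equal classes collapse in the set.
classes = {equivalence_class(a, R, A) for a in A}
```

Five people give only three distinct classes, one per birth month that actually occurs, and every person lies in exactly one of them.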


Exercise 7.23 (Solution on page 440)

What are the equivalence relations defined by the finest and coarsest
partitions of a set A identified in Exercise 7.21?


Exercise 7.24 (Solution on page 440)


Let the relation R on the set $A = \{ 1, 2, 3, \ldots, 29 \}$ of positive integers less
than 30 be defined by:

$(x, y) \in R$ if, and only if, x and y have the same prime factors.

For example, $(12, 18) \in R$ since $12 = 2 \times 2 \times 3$ and $18 = 2 \times 3 \times 3$ have the
same prime factors 2 and 3. Clearly this is an equivalence relation.

How many equivalence classes does R partition A into? List each of these
equivalence classes.


7.5 Additional Exercises

1. Consider the following members of the family of Don Vito Corleone. Don Vito
and his wife Carmella have four children: Santino, Federico, Michael and
Constanzia. Santino is married to Sandra and they have four children:
Santino Jr, Francesca, Kathryn and Frank. Michael is married to Kay
and they have two children: Anthony and Mary. Constanzia is married
to Carlo and they have two children: Victor and Michael Francis.
Federico is not married and has no children.

(a)  List  out  the  set  Corleones  of  all  persons  mentioned  above. 

(b)  List  out  the  relations  Father,  Mother,  Husband  and  Sibling. 

(c)  Define  the  relation  Father  in  terms  of  Mother  and  Husband. 

(d)  Define  the  relations  Parent,  Wife  and  Spouse  in  terms  of  the 
above  relations,  and  list  these  out. 

(e)  Define  the  relations  Father-In-Law,  Mother-In-Law  and  Cousin 
in  terms  of  the  above  relations,  and  list  these  out. 

2. Indicate which of the following relations defined over the integers $\mathbb{Z}$
are reflexive, which are irreflexive, which are symmetric, which are
antisymmetric, and which are transitive. Justify your answers.

(a) $R_1 = \{ (a, b) : a = b \text{ or } a = -b \}$.

(b) $R_2 = \{ (a, b) : a = b - 1 \}$.

(c) $R_3 = \{ (a, b) : a + b < 10 \}$.

(d) $R_4 = \{ (a, b) : a < 2b \}$.

3. Indicate which of the following relations defined over the positive
integers are reflexive, which are irreflexive, which are symmetric, which
are antisymmetric, and which are transitive. Justify your answers.

(a) The divisibility relation $a \mid b$ which holds if, and only if, a divides
evenly into b.

(b) The relatively prime relation which holds between a and b if,
and only if, their greatest common divisor is 1.

(c) The relation which holds between a and b if, and only if, their
difference (i.e., the larger minus the smaller) is divisible by 3.

4.  What  does  a  symmetric  and  transitive  relation  look  like?  Is  it  true 
that  any  binary  relation  which  is  symmetric  and  transitive  must  also 
be  reflexive?  Justify  your  answer. 

5. Suppose R and S are symmetric relations on a set A. Which of the
following must be a symmetric relation? Justify your answers.

(a) $R \cup S$. (b) $R \cap S$. (c) $R \circ S$. (d) $\overline{R}$. (e) $R^{-1}$.

6. Suppose R and S are transitive relations on a set A. Which of the
following must be a transitive relation? Justify your answers.

(a) $R \cup S$. (b) $R \cap S$. (c) $R \circ S$. (d) $\overline{R}$. (e) $R^{-1}$.

7.  Match  the  property  of  the  binary  relation  R  on  A  listed  on  the  left  to 
a  characterisation  of  that  property  on  the  right: 

1. reflexive        (a) R ∘ R ⊆ R
2. irreflexive      (b) id_A ∩ R = ∅
3. symmetric        (c) R = R⁻¹
4. antisymmetric    (d) id_A ⊆ R
5. transitive       (e) R ∩ R⁻¹ ⊆ id_A

8.  The  reflexive  closure  of  a  relation  R  over  a  set  A  is  the  smallest 
reflexive  relation  that  contains  R.  Similarly,  the  symmetric  closure 
of  a  relation  R  over  a  set  A  is  the  smallest  symmetric  relation  that 
contains  R,  and  the  transitive  closure  of  a  relation  R  over  a  set  A 
is  the  smallest  transitive  relation  that  contains  R. 

Compute  the  reflexive,  symmetric  and  transitive  closures  of  the  binary 
relation  R  =  {(0, 1),  (1,  2),  (3, 4),  (4,  3)}  over  the  set  A  =  {0, 1,  2,  3,  4}. 

9. Prove that R ∪ { (a, a) : a ∈ A } is the reflexive closure of R.

10. Prove that R ∪ R⁻¹ is the symmetric closure of R.

11. Prove that

{ (a₁, aₙ) : ∃ a₂, a₃, ..., aₙ₋₁ such that (aᵢ, aᵢ₊₁) ∈ R
for each i = 1, 2, ..., n−1 }

is the transitive closure of R.

12. Let us say that two real numbers x and y are approximately equal,
and write x ≈ y, if, and only if, they differ by no more than 1/1000.
Thus, the relation ≈ on ℝ is defined as follows:

≈ = {(x, y) : |x − y| ≤ 1/1000}.

Intuitively this ought to be an equivalence relation. Explain why this
relation is - or is not - reflexive, symmetric and transitive.

13. Consider the relation ≤ defined on a Boolean algebra B as follows: for
all x, y ∈ B, x ≤ y if, and only if, x + y = y.

(a) Prove that ≤ is a partial order.

(b) What does ≤ correspond to in the Boolean algebra of sets?

(c) What does ≤ correspond to in the Boolean algebra of propositions?

14.  Assuming  that  R  is  an  equivalence  relation  on  A,  show  directly  from 
the  definitions  that  the  following  statements  about  two  elements  a  and 
b  of  A  are  equivalent: 

(a) a R b    (b) [a]_R = [b]_R    (c) [a]_R ∩ [b]_R ≠ ∅


Chapter  8 

Inductive  and  Recursive 
Definitions 


Great  fleas  have  little  fleas, 

Upon  their  backs  to  bite  ’em, 

And  little  fleas  have  lesser  fleas, 

And  so  ad  infinitum. 

-  Augustus  De  Morgan. 

Most of the objects under study within Computer Science are defined inductively:
that is, they are defined in terms of smaller instances of themselves.
Numbers, lists, binary trees, and even computer programs themselves, are
all built up from smaller objects of the same type. For example, two computer
programs stuck together, typically with a semicolon between them
so that the second is executed once the first completes its task, form nothing
more than a program defined in terms of two smaller programs. Also, functions
defined over such objects are typically given by inductive definitions,
whereby the value of the function on an inductively-defined object is defined
by the value of the function on smaller objects. More generally, a recursive
definition allows a function to be defined in terms of its value on arbitrary
objects, not necessarily smaller ones, and such definitions can still be
meaningfully employed.

Understanding inductively-defined objects, and the functions defined on
them, will naturally rely on understanding the inductive nature of such definitions.
In this chapter, we explore such inductive definitions and recursively-defined
functions.


8.1 Inductively-Defined Sets


As we saw, we can define finite sets by simply listing their elements, such as

BinaryDigits = {0, 1}

DecimalDigits = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

Letters = { a, b, c, d, e, f, g, h, i, j, k, l, m,
n, o, p, q, r, s, t, u, v, w, x, y, z }

Children = {Joel, Felix, Oskar, Amanda}

However,  for  infinite  sets  we  have  had  to  resort  to  using  some  (implicit  or 
explicit)  rule  for  generating  their  members.  For  example,  the  set  of  natural 
numbers 


N = {0, 1, 2, 3, ...}

which  we  defined  (informally)  in  Chapter  2  relies  on  our  ability  as  intelligent 
beings  to  extract  the  implicit  rule  hinted  at  by  the  ellipses  which  says  that 
adding  one  to  any  element  of  this  set  gives  the  “next”  element  in  the  set. 
However,  this  approach  to  defining  sets  is  fraught  with  complications. 

1.  How  can  we  expect  a  non-intelligent  entity  (such  as  a  computer)  to 
be  able  to  understand  such  a  definition?  At  the  very  least  we  would 
somehow  have  to  make  explicit  the  rule  for  generating  the  elements  of 
the  set. 

2. How can we even be certain of the implicit rule underlying the defining
equation? For example, the author of the above definition may intend
N to represent the decimal digits (and thus end at the digit 9), or the
roots (i.e., solutions) of the equation x⁴ − 6x³ + 11x² − 6x = 0 (in which
case N would contain only the four values listed).

3. The order in which we list the elements of a set is irrelevant, so what
sense does it make to refer to the “next” element in a set?

4. How can we determine when some object, √9 say, is in the set we are
defining while another object, √10 say, is not?

One  easy  way  of  defining  an  infinite  collection  of  objects  is  to  provide 
a  method  for  generating  new  elements  from  existing  ones.  This  idea  is 
encompassed  by  the  following  definition. 

Definition 8.1

An inductive definition of a set has three components.

1. The basis clause, which establishes that certain objects are in the
set. These elements constitute the “building blocks” for constructing
further elements in the set.

2. The inductive clause, which defines the ways in which elements of a
set can be used to produce further elements which are also in the set.

3. The extremal clause, which asserts that no object is an element of
the set being defined unless its membership can be established from the
first two clauses. In other words, the set being defined is the smallest
set which satisfies the first two clauses.


We  can  represent  precisely  the  set  N  of  natural  numbers  by  way  of  the 
following  inductive  definition. 


1. 0 ∈ N.

2. (n+1) ∈ N whenever n ∈ N.

In other words, n ∈ N ⇒ (n+1) ∈ N.

3. Nothing else is in N. That is, nothing is in N unless it can be constructed
from the first two clauses.

In other words, N is the smallest set satisfying the first two clauses.

The  basis  clause  declares  the  number  0  as  a  basic  element  of  the  set  N;  and 
the  inductive  clause  says  that  given  a  natural  number  n,  we  can  produce 
another  natural  number  n+1  by  adding  1  to  the  given  number  n.  In  this 
way we can conclude that √9 = 3 is an element of N, since 0 is an element of
N  (by  the  basis  clause),  and  hence  0+1  =  1  is  an  element  (by  the  inductive 
clause),  and  hence  1+1  =  2  is  an  element  (again  by  the  inductive  clause), 
and  thus  finally  2+1  =  3  is  an  element  (by  a  further  use  of  the  inductive 
clause). 

The  extremal  clause  tells  us  that  an  element  of  N  has  to  be  either  0 
(from  the  basis  clause)  or  the  successor  of  another  element  of  N  (from  the 
inductive clause). We could not infer that √10 ≈ 3.16 is an element of N,
as there is no way to construct √10 from these basis and inductive clauses:
√10 is clearly not 0; and no matter how many times we add 1 to 0 we will
never generate the value √10. Hence we must conclude that √10 is not an
element of N as defined.

Alternatively, we can easily see that the set {0, 1, 2, 3, 4, ... } satisfies
clauses (1) and (2) of the definition. Therefore, since N is being defined to
be the smallest set satisfying these clauses, N must be a subset of this; since
this set does not contain √10, √10 ∉ N.
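The construction argument above can be mimicked mechanically. The following Python sketch is only an illustration: the function name `reachable_from_zero` and the cut-off parameter `limit` are assumptions introduced here, and the cut-off is needed because a program cannot keep applying the inductive clause forever.

```python
import math

def reachable_from_zero(x, limit=1000):
    """Return True if x is produced by the basis clause (0) followed by
    at most `limit` applications of the inductive clause (n -> n + 1)."""
    n = 0                      # basis clause: 0 is in N
    for _ in range(limit + 1):
        if n == x:
            return True
        if n > x:              # adding 1 can only move further away from x
            return False
        n = n + 1              # inductive clause: n in N implies n + 1 in N
    return False               # gave up: not reached within `limit` steps

print(reachable_from_zero(math.sqrt(9)))   # True: 3 = 0 + 1 + 1 + 1
print(reachable_from_zero(math.sqrt(10)))  # False: it lies strictly between 3 and 4
```

A `False` answer caused by the cut-off is of course weaker than the extremal clause itself, which rules out every object that can never be constructed.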


Exercise 8.1 (Solution on page 440)


Explain, using this inductive definition of N, why 4 ∈ N while 4.5 ∉ N.


We can inductively define the set

Odd = {1, 3, 5, 7, ...}

of odd natural numbers as the smallest set satisfying the following:

1. 1 ∈ Odd.

2. If n ∈ Odd then (n+2) ∈ Odd.

Note  that  in  this  example,  we  incorporated  the  extremal  clause  into  the 
preamble  of  the  definition,  by  defining  the  set  to  be  the  smallest  set  satisfying 
the  basis  and  inductive  clauses;  being  the  smallest  such  set,  only  those 
elements  which  must  be  in  the  set  due  to  the  basis  and  inductive  clauses 
are  actually  members.  We  could  have  instead  included  the  extremal  clause; 
however,  the  above  is  a  common  useful  abbreviated  form. 
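As a small illustration, the two clauses defining Odd translate directly into a generator that lists its elements in the order they are constructed. This is a sketch in Python; the name `odd_numbers` is an assumption made for this example.

```python
from itertools import islice

def odd_numbers():
    """Enumerate Odd: start from the basis element and keep applying
    the inductive clause."""
    n = 1            # basis clause: 1 is in Odd
    while True:
        yield n
        n = n + 2    # inductive clause: n in Odd implies n + 2 in Odd

print(list(islice(odd_numbers(), 5)))  # [1, 3, 5, 7, 9]
```

Only numbers reached in this way are ever produced, which is the computational counterpart of taking the smallest set satisfying the clauses.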


Exercise 8.2 (Solution on page 441)

The  set  N  satisfies  the  two  clauses  in  the  definition  of  Odd;  that  is,  it 
contains  1,  and  it  contains  (n+2)  whenever  it  contains  n.  Why  does  this 
not  imply  that  Odd  =  N? 


Exercise 8.3 (Solution on page 441)

Give an inductive definition for the set Powers-of-2 of powers of 2,
Powers-of-2 = {1, 2, 4, 8, 16, 32, 64, ...}.


Example 


Given a finite set S, we can define the powerset P(S) of S inductively as
the smallest set satisfying the following:


1. ∅ ∈ P(S).

2. If X ∈ P(S) and a ∈ S then X ∪ {a} ∈ P(S).

For example, if S = {1, 2, 3}, then by the basis clause ∅ ∈ P(S), and by
one application of the inductive clause we get that the following sets are in
P(S):


∅ ∪ {1} = {1}    ∅ ∪ {2} = {2}    ∅ ∪ {3} = {3}

This application reveals that all of the singleton sets {1}, {2} and {3} are in
P(S). A second application of the inductive clause tells us that the following
sets are in P(S):

∅ ∪ {1} = {1}        ∅ ∪ {2} = {2}        ∅ ∪ {3} = {3}
{1} ∪ {1} = {1}      {1} ∪ {2} = {1, 2}   {1} ∪ {3} = {1, 3}
{2} ∪ {1} = {1, 2}   {2} ∪ {2} = {2}      {2} ∪ {3} = {2, 3}
{3} ∪ {1} = {1, 3}   {3} ∪ {2} = {2, 3}   {3} ∪ {3} = {3}


This second application reveals that all of the two-element sets {1, 2}, {1, 3}
and {2, 3} are also in P(S). A third application of the inductive clause would
reveal that, apart from the above sets, the three-element set S = {1, 2, 3}
itself is in P(S). Further applications of the inductive clause would generate
no new elements.
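The repeated application of the inductive clause until nothing new appears can be sketched in Python as follows. The function name `powerset` and the use of `frozenset` to represent the sets X are choices made for this illustration only.

```python
def powerset(s):
    """Smallest family of sets containing the empty set and closed under
    adding one element of s (practical only for small finite s)."""
    family = {frozenset()}                     # basis clause: the empty set
    while True:
        # one application of the inductive clause to every set found so far
        new = {x | {a} for x in family for a in s} - family
        if not new:                            # nothing new: we are done
            return family
        family |= new

subsets = powerset({1, 2, 3})
print(len(subsets))                            # 8
print(sorted(sorted(x) for x in subsets))
```

For S = {1, 2, 3} the loop adds the singletons, then the two-element sets, then S itself, and stops on the fourth pass, matching the applications traced above.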


Exercise 8.4 (Solution on page 441)


Why can the above definition not be applied to infinite sets? (Hint: Why
would this definition not provide Odd ∈ P(N), where Odd is as defined in
Example 8.2?)


8.2 Inductively-Defined Syntactic Sets

The  elements  of  the  set  N  of  natural  numbers  as  defined  above  are  semantic 
values,  not  syntactic  objects.  To  understand  the  distinction  clearly,  if  we 
define  the  set 

Children  =  {Joel,  Felix,  Oskar,  Amanda} 

we have to make clear whether we mean the set of four names, or the collection
of people which make up four specific children. Each name in the
list is merely a syntactic object unless we assign some meaning or semantic
content to it.

In the same way, we have that √9 is an element of N, as √9 = 3 and
3 is an element of N. The set N represents the collection of values making
up  the  natural  numbers,  not  some  arbitrary  representation  of  them  such  as 
decimal  numbers  (sequences  of  decimal  digits)  or  binary  numbers  (sequences 
of  binary  digits). 

To define sets of such syntactic objects, we first introduce some terminology.
An alphabet is a finite set of symbols or characters. A finite
sequence of characters from an alphabet A is called a string or word over
A. The length of a word w = a₁a₂a₃⋯aₙ, where n ∈ N and aᵢ ∈ A for
each 1 ≤ i ≤ n, is given by the number n of (occurrences of) characters
in w. We shall use the special symbol ε (which cannot be a character of the
alphabet A) to denote the empty word, that is, the only word of length 0.
Note that εw = wε = w for any word w.

Finally, we shall use A* to denote the set of all words over A, and A+
to denote the set of non-empty words over A. We can define these two sets
inductively as follows.

Definition 8.4

The set A* of words over alphabet A is the smallest set satisfying the following:

1. ε ∈ A*; and

2. if w ∈ A* and a ∈ A then aw ∈ A*.

The set A+ of non-empty words over alphabet A is the smallest set satisfying
the following:

1. a ∈ A+ for each a ∈ A; and

2. if w ∈ A+ and a ∈ A then aw ∈ A+.


Example 8.4

If A = {a, b}, then A* is the set consisting of all sequences of a’s and b’s,
including the empty sequence containing no characters:

A* = {ε, a, b, aa, ab, ba, bb, aaa, aab, aba, abb, ... }.

This is since:

• by the first (basis) clause, ε ∈ A*;

• by the second (inductive) clause, adding either an a or a b to the front
of any word in A* gives a word in A*, and as we know ε ∈ A*, this
means that { a, b } ⊆ A*;

• but then by the second (inductive) clause, since we now know that
{ ε, a, b } ⊆ A*, we can infer that { a, b, aa, ab, ba, bb } ⊆ A*;

• by a third application of the second (inductive) clause, we can now
infer that

{ a, b, aa, ab, ba, bb,
aaa, aab, aba, abb, baa, bab, bba, bbb } ⊆ A*;

and each new application of the second (inductive) clause adds more new
strings to the set.

Similarly,  A+  is  the  set  consisting  of  all  non-empty  sequences  of  a’s  and 
b’s: 


A+ = {a, b, aa, ab, ba, bb, aaa, aab, aba, abb, ... }.

We could have defined the sets A* and A+ in various other equivalent
ways. For example, we could have used wa instead of (or as well as) aw in
each of the second (inductive) clauses; or we could have provided just one
inductive definition and defined the second set directly in terms of the first,
by observing that A* = A+ ∪ { ε } and A+ = A* \ { ε }.
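The round-by-round growth of A* described in this example can be reproduced with a short Python sketch. The function name `words_up_to` and its `rounds` parameter are assumptions of this illustration, and the empty word ε is represented by the empty string `""`.

```python
def words_up_to(alphabet, rounds):
    """Words of A* obtained from the empty word by at most `rounds`
    applications of the inductive clause (put one character in front)."""
    words = {""}                                   # basis clause: the empty word
    for _ in range(rounds):
        words |= {a + w for a in alphabet for w in words}
    return words

result = words_up_to({"a", "b"}, 2)
print(sorted(result, key=lambda w: (len(w), w)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Removing `""` from the result gives the corresponding words of A+, in line with the observation that A+ = A* \ { ε }.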

We  can  now  define  the  sets  of  decimal  and  binary  numbers  as  the  sets  of 
non-empty  words  over  decimal,  respectively  binary,  digits. 

DecimalNumbers  =  DecimalDigits+ 

BinaryNumbers  =  BinaryDigits+ 

Exercise 8.5 (Solution on page 441)

Give an inductive definition of PosDecimalNumbers, the set of positive
decimal numbers. Such numbers should not have leading zeros; that is,
35 ∈ PosDecimalNumbers but 035 ∉ PosDecimalNumbers.


8.3 Backus-Naur Form


A common style of presenting an inductive definition of a set of syntactic
objects is the so-called Backus-Naur Form (BNF), in which the syntactic
forms are presented equationally. For example, the set A* of words over
A is given by the BNF equation

w ::= ε | aw

and the set A+ of non-empty words over A is given by the BNF equation

w ::= a | aw

where in both cases a is taken to range over the alphabet A. In this way,
BNF provides a short-hand form of writing out inductive definitions.

As another example, the natural numbers N were defined in terms of
zero 0 and the successor function s(n) = n+1. These elements can be
specified by the BNF equation

n  ::=  0  |  s(n). 

Hence,  for  example,  the  number  4  is  formally  defined  as  s(s(s(s(0)))). 
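To make the distinction between a number and its formal numeral concrete, here is a minimal Python sketch that converts in both directions. The function names `numeral` and `value` are hypothetical, chosen for this illustration.

```python
def numeral(n):
    """Formal numeral for the natural number n, built from 0 and s."""
    return "0" if n == 0 else "s(" + numeral(n - 1) + ")"

def value(term):
    """Inverse direction: the number denoted by a numeral."""
    # strip the leading "s(" and the trailing ")" of s(...)
    return 0 if term == "0" else 1 + value(term[2:-1])

print(numeral(4))          # s(s(s(s(0))))
print(value(numeral(4)))   # 4
```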

Inductive definitions of sets of syntactic expressions are very common
in Computer Science. Indeed we have seen several already, such as the
set of propositional formulae, which we can now define formally as follows.

The set of propositional formulae can be defined inductively as the smallest
set satisfying the following:


1. true and false are propositional formulae, as is every propositional
variable P.

2. If p and q are propositional formulae then so are ¬p, p ∨ q, p ∧ q, p ⇒ q
and p ⇔ q.

More succinctly, the following is a BNF equation for propositional formulae.

p, q ::= true | false | P | ¬p | p ∨ q | p ∧ q | p ⇒ q | p ⇔ q

Here, P is taken to range over the set of propositional variables.
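A BNF equation of this kind maps naturally onto a data type with one constructor per alternative. The Python sketch below is one possible encoding, not a prescribed one: the class names `Const`, `Var`, `Not` and `BinOp`, the operator tags, and the helper `show` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Const:            # true | false
    value: bool

@dataclass(frozen=True)
class Var:              # a propositional variable P
    name: str

@dataclass(frozen=True)
class Not:              # ¬p
    arg: "Formula"

@dataclass(frozen=True)
class BinOp:            # p ∨ q, p ∧ q, p ⇒ q, p ⇔ q
    op: str             # one of "or", "and", "implies", "iff"
    left: "Formula"
    right: "Formula"

Formula = Union[Const, Var, Not, BinOp]

def show(f):
    """Render a formula, fully parenthesised."""
    if isinstance(f, Const):
        return "true" if f.value else "false"
    if isinstance(f, Var):
        return f.name
    if isinstance(f, Not):
        return "¬" + show(f.arg)
    symbols = {"or": "∨", "and": "∧", "implies": "⇒", "iff": "⇔"}
    return "(" + show(f.left) + " " + symbols[f.op] + " " + show(f.right) + ")"

example = BinOp("implies", Var("P"), Not(BinOp("and", Var("Q"), Const(True))))
print(show(example))    # (P ⇒ ¬(Q ∧ true))
```

Note that `show` is itself defined by cases on the constructors, following the structure of the BNF equation.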


Exercise 8.6 (Solution on page 441)


Give  an  inductive  definition  of  the  set  of  formulae  of  predicate  logic. 


BNF  notation  was  invented  in  1959  by  John  Backus  (and  later  simplified 
by  Peter  Naur)  to  define  the  syntax  of  the  ALGOL  programming  language. 
It  then  became  a  common  feature  of  the  appendix  to  programming  language 
reference  books.  This  is  due  to  the  fact  that  the  set  of  programs  which  can 
be  written  in  a  given  programming  language  can  be  defined  inductively  from 
the  constructs  of  the  language. 


The  following  BNF  equation  describes  a  very  simple  programming  language. 


p ::= x := e | p₁; p₂ | if b then p₁ else p₂ | while b do p

For readability, this is typically rendered in list fashion as follows:


p ::= x := e
    | p₁; p₂
    | if b then p₁ else p₂
    | while b do p

In the above, x is taken to range over program variables; and e and b range
over integer expressions and Boolean expressions, respectively, which themselves
will similarly be defined inductively. Thus a program in this programming
language is either

• an assignment statement “x := e” which evaluates the integer expression
e and assigns this value to the variable x; or

• the sequential composition “p₁; p₂” of two (smaller) programs p₁ and p₂,
which first executes the program p₁, and then executes the program
p₂ if and when program p₁ has terminated; or

• a conditional statement “if b then p₁ else p₂” involving a Boolean
test b and two (smaller) programs p₁ and p₂, which first evaluates the
Boolean expression b, and then either executes the program p₁ if b
evaluated to true, or executes the program p₂ if b evaluated to false;
or

• a while loop “while b do p” involving a Boolean test b and a (smaller)
program p, which repeatedly executes the program p for as long as the
Boolean test b is true; that is, it first evaluates the Boolean expression
b, and then either terminates if b evaluated to false, or executes the
program p and repeats itself (starting with re-evaluating the Boolean
expression b) if b evaluated to true.

We  shall  include  one  further  minor  -  yet  essential  -  piece  of  syntax  in 
this  language:  we  will  allow  ourselves  to  add  braces  around  any  program, 
thus  writing  {p},  in  order  to  avoid  ambiguity.  This  is  illustrated  in  the 
following  example. 

The following is a program in this language for computing the sum of
the first n positive integers: s = 1 + 2 + 3 + ⋯ + n.


i := 0;
s := 0;
while i < n do
{ i := i + 1;
  s := s + i }

This 5-line program consists of two smaller programs combined with the
sequential composition symbol. In the breakdowns below, each pair of square
brackets encloses one constituent program.

[ i := 0 ] ;
[ s := 0 ; while i < n do { i := i + 1 ; s := s + i } ]

The first of these programs is a simple assignment statement, while the
second program is itself built up from two even smaller programs combined
with the sequential composition symbol:

[ s := 0 ] ;
[ while i < n do { i := i + 1 ; s := s + i } ]

Again, the first of these programs is a simple assignment statement, while the
second program is a while loop, the body of which is a program consisting of
two simple assignment statements combined with the sequential composition
symbol. The whole program thus breaks down as follows:

[ i := 0 ] ;
[ [ s := 0 ] ;
  [ while i < n do
    { [ i := i + 1 ] ;
      [ s := s + i ] } ] ]

It is possible to interpret this program differently, namely as two programs
combined with the sequential composition symbol, the first being itself two
simple assignment statements composed together sequentially, and the second
being the while loop. The breakdown would then look as follows.

[ [ i := 0 ] ;
  [ s := 0 ] ] ;
[ while i < n do
  { [ i := i + 1 ] ;
    [ s := s + i ] } ]

This particular ambiguity is harmless. However, the potential for dangerous
ambiguity is why the program includes braces around the body of the while
loop. Without these, it would be possible (and moreover likely) that the
program would be interpreted wrongly as follows.

[ [ i := 0 ] ;
  [ s := 0 ] ;
  [ while i < n do
    [ i := i + 1 ] ] ] ;
[ s := s + i ]

This  program  -  or  rather  this  interpretation  of  the  program  -  would  return 
the  incorrect  result  s  =  n,  as  the  while  loop  would  do  nothing  but  increment 
the  counter  i  until  it  reached  this  value. 
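The difference between the intended reading and the wrong one can be checked by transcribing both into an ordinary programming language. The Python sketch below does this; the function names `braced` and `unbraced` are invented for the illustration, and Python's indentation plays the role of the braces.

```python
def braced(n):
    """while i < n do { i := i + 1; s := s + i }"""
    i = 0
    s = 0
    while i < n:
        i = i + 1
        s = s + i        # inside the loop body
    return s

def unbraced(n):
    """{ while i < n do i := i + 1 }; s := s + i"""
    i = 0
    s = 0
    while i < n:
        i = i + 1
    s = s + i            # executed once, after the loop has finished
    return s

print(braced(5))     # 15
print(unbraced(5))   # 5
```

With n = 5 the intended program yields 1 + 2 + 3 + 4 + 5 = 15, whereas the misread program yields s = n = 5.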


8.4 Inductively-Defined Data Types


Most data types used in computer programming languages are inductively
defined, either by the compiler (the integers, for example) or by the programmer.
For example, a list of natural numbers can be defined by the
following BNF equation.

L ::= [ ] | n : L


In  this  definition,  n  ranges  over  natural  numbers,  and  the  colon  symbol 
represents  the  operation  of  adding  an  element  to  the  front  of  a  list. 
Thus,  a  list  is  either  the  empty  list  [  ]  (the  list  containing  no  items),  or 
a  list  obtained  by  adding  a  natural  number  n  to  the  head  of  a  (smaller) 
list L. For example, the list [1, 2, 3] is built up inductively starting from
[ ] as 1 : 2 : 3 : [ ]. For clarity this could be written using parentheses as
1 : (2 : (3 : [ ])).
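The same construction can be imitated in a language without a built-in colon operator. In the Python sketch below, which is only one possible encoding, the empty list [ ] is represented by `None` and n : L by the pair `(n, L)`; the names `EMPTY`, `cons` and `to_python_list` are assumptions of this example.

```python
EMPTY = None                      # stands for the empty list [ ]

def cons(n, rest):
    """The operation n : L, adding n to the front of the list `rest`."""
    return (n, rest)

def to_python_list(lst):
    """Flatten a list built with cons into an ordinary Python list."""
    items = []
    while lst is not EMPTY:
        head, lst = lst           # split n : L into n and L
        items.append(head)
    return items

one_two_three = cons(1, cons(2, cons(3, EMPTY)))   # 1 : (2 : (3 : [ ]))
print(one_two_three)                   # (1, (2, (3, None)))
print(to_python_list(one_two_three))   # [1, 2, 3]
```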

Of  course,  we  could  choose  any  other  type  of  data  to  form  a  list  over; 
e.g.,  a  list  of  names  is  defined  as  above  but  by  letting  n  range  over  names 
rather  than  numbers. 

As  a  further  example,  the  binary  tree  is  a  widely  used  data  structure, 
and  can  be  defined  inductively  as  follows. 


Example 8.


We may inductively define binary trees using the following BNF equation.


t ::= * | N(t₁, t₂)


That is, a tree is either a leaf * or an internal
node N(t₁, t₂) with two subtrees
t₁ and t₂. For example, the tree

*)),*))

may be represented by the picture shown.


This binary tree definition only provides the structure of the data structure,
but you typically want to store data in data structures. For example, a
dictionary  might  be  represented  by  a  binary  tree  with  names  stored  in  the 
(internal)  nodes,  with  the  intention  that  all  names  stored  in  the  left  subtree 
precede  (alphabetically)  the  name  stored  in  the  parent  node,  and  all  names 
stored  in  the  right  subtree  follow  the  name  stored  in  the  parent  node.  For 
example,  valid  dictionaries  for  storing  the  list  of  names 

{Joel,  Felix,  Oskar,  Amanda} 


may be given by either of the following trees:

[Figure: two binary trees, each a valid dictionary storing the names Joel, Felix, Oskar and Amanda.]


Give an inductive definition for the dictionary data structure outlined
above. Note that the data structure would only define the syntactic structure;
the fact that the names are stored in proper lexicographic order is a
semantic issue which will not be reflected in the definition.


8.5 Inductively-Defined Functions

We can exploit the inductive definition of a set to provide convenient definitions
for functions on that set. The function is defined by specifying its
values on the basic elements of the set, and then specifying its values on the
inductively-defined elements in terms of its previously-defined values.

For example, an infinite sequence

a₀, a₁, a₂, a₃, a₄, a₅, ...

is provided by a function whose domain is N, and can often be defined by
specifying the initial value a₀ and each subsequent value aₙ in terms of the
values aₖ for k < n.


The factorial function n! is defined to be the product of the integers
from 1 to n:


n! = 1 × 2 × 3 × ⋯ × n.

More formally, it can be defined inductively as follows.

0! = 1; and

n! = n × (n−1)! (for n > 0).


Thus, for example,

5! = 5 × 4!
   = 5 × (4 × 3!)
   = 5 × 4 × (3 × 2!)
   = 5 × 4 × 3 × (2 × 1!)
   = 5 × 4 × 3 × 2 × (1 × 0!)
   = 5 × 4 × 3 × 2 × 1 × 1
   = 120
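The inductive definition of the factorial carries over almost word for word to a recursive program. A minimal Python sketch follows; the function name `factorial` is our own choice.

```python
def factorial(n):
    """Factorial of a natural number n, following the two clauses."""
    if n == 0:
        return 1                      # 0! = 1
    return n * factorial(n - 1)       # n! = n × (n−1)!  for n > 0

print(factorial(5))   # 120
```

Each recursive call is on a strictly smaller natural number, so the unfolding must eventually reach the basis clause, just as in the calculation of 5! above.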


Exercise 8.8 (Solution on page 442)

Compute the first few values of the sequence sₙ defined inductively by:

s₀ = 0

sₙ = sₙ₋₁ + 2n − 1

Can you recognise this sequence as a function of n?

Example 8.9

The harmonic numbers Hₙ are informally defined by

Hₙ = 1 + 1/2 + 1/3 + ⋯ + 1/n

and can be defined inductively as follows.

H₀ = 0; and

Hₙ = Hₙ₋₁ + 1/n (for n > 0).
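As an illustration, the harmonic numbers can be computed exactly by adding 1/n at each inductive step, using Python's standard `fractions` module to avoid rounding. The function name `harmonic` is an assumption of this sketch.

```python
from fractions import Fraction

def harmonic(n):
    """The n-th harmonic number as an exact fraction."""
    if n == 0:
        return Fraction(0)                      # basis: H_0 = 0
    return harmonic(n - 1) + Fraction(1, n)     # inductive step: add 1/n

print(harmonic(3))   # 11/6
```

Here harmonic(3) unfolds to 0 + 1 + 1/2 + 1/3 = 11/6.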


Exercise 8.9 (Solution on page 442)

Compute  the  harmonic  number  H6  from  its  inductive  definition. 

Example 8.10

The Fibonacci numbers are defined inductively as follows.

f₀ = 0;

f₁ = 1; and

fₙ = fₙ₋₁ + fₙ₋₂ (for n > 1).

That is, each number in this sequence is obtained by adding together the
previous two numbers in the sequence. The first few Fibonacci numbers are

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, ...
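The three clauses of the Fibonacci definition can be transcribed directly into a recursive function. The Python sketch below does so under the name `fib`, chosen here for illustration; it favours closeness to the definition over efficiency, since it recomputes earlier values many times.

```python
def fib(n):
    """The n-th Fibonacci number, clause by clause."""
    if n == 0:
        return 0                       # f_0 = 0
    if n == 1:
        return 1                       # f_1 = 1
    return fib(n - 1) + fib(n - 2)     # f_n = f_(n-1) + f_(n-2)  for n > 1

print([fib(n) for n in range(14)])
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]
```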


This sequence derives its name from the Italian mathematician Leonardo
of Pisa, more commonly known by his nickname Fibonacci. Fibonacci was
instrumental in spreading the use of the modern Hindu-Arabic numeral system
to Europe, as an alternative to Roman numerals, through his book on
arithmetic Liber Abaci (The Book of Calculation), which was published
in the early 13th century. The Fibonacci numbers appear in the solution of
the following problem posed in this book.


Exercise 8.10 (Solution on page 442)

Suppose  you  have  a  pair  of  new-born  rabbits  at  the  start  of  month  1,  and 
that  each  pair  of  rabbits  produces  a  new  pair  of  rabbits  after  2  months  and 
each  month  thereafter.  How  many  pairs  of  rabbits  will  you  have  at  the  start 
of  the  nth  month?  (Work  out  the  first  few  months  and  look  for  a  pattern.) 


It  is  worth  looking  more  carefully  at  the  above  inductive  definitions  of 
sequences.  As  the  natural  numbers  N  are  defined  inductively  in  terms  of 
zero  0  and  the  successor  function  s(n)  =  n+1,  functions  over  them  are 
naturally  defined  inductively.  The  above  sequences  are  simple  examples, 
but  induction  can  be  used  to  define  more  complicated  functions  than  just 
sequences. 

Example 8.11

By resorting to the inductive definition of the natural numbers

n ::= 0 | s(n)

as given on page 207, we can inductively define the function

add : N × N → N

which adds two numbers as follows:

add(m, 0) = m; and

add(m, s(n)) = s(add(m, n)).

The  first  clause  merely  states  that  m+0  =  m;  and  the  second,  inductive, 
clause  is  the  precise  way  of  writing  what  we  would  more  naturally  write  as: 

add(m,  n+1)  =  add(m,  n)  +  1. 


Thus, for example,

add(3, 2) = add(3, 1) + 1
          = add(3, 0) + 1 + 1
          = 3 + 1 + 1
          = 5.
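The definition of add can be run as a program that uses nothing but zero, the successor function and recursion on the second argument. In the following Python sketch the names `s` and `add` mirror the text, while representing natural numbers by Python integers is a simplifying assumption.

```python
def s(n):
    """Successor function s(n) = n + 1."""
    return n + 1

def add(m, n):
    """Addition defined by induction on the second argument."""
    if n == 0:
        return m                   # add(m, 0) = m
    return s(add(m, n - 1))        # add(m, s(k)) = s(add(m, k)), where n = s(k)

print(add(3, 2))   # 5
```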


Exercise 8.11 (Solution on page 443)

Give an inductive definition of the function

mult : N × N → N

which  multiplies  two  numbers,  in  terms  of  zero  and  the  successor  function, 
as  well  as  the  function  add  defined  above. 

We  can,  of  course,  define  functions  inductively  over  any  inductively- 
defined  set.  The  inductive  function  definitions  will  naturally  follow  the 
structure  of  the  inductive  definitions  of  the  domain. 

Example 8.12

The length of a word w ∈ A* can be defined inductively as follows.

length(ε) = 0

length(aw) = 1 + length(w) (for a ∈ A).

The length of a list of natural numbers can be defined inductively as follows.

length([ ]) = 0

length(n : L) = 1 + length(L) (for n ∈ N).

The height of a binary tree can be defined inductively as follows.

height(*) = 0

height(N(t₁, t₂)) = 1 + max(height(t₁), height(t₂)).
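Two of these definitions are sketched below in Python to show how each function follows the shape of its domain. The encodings are assumptions made for the illustration: a word is a Python string with `""` for the empty word, a leaf is the string `"*"`, and an internal node is a pair of subtrees; the names `length_word`, `LEAF` and `height` are likewise our own.

```python
def length_word(w):
    """Length of a word: 0 for the empty word, otherwise 1 + length of the rest."""
    return 0 if w == "" else 1 + length_word(w[1:])

LEAF = "*"

def height(t):
    """Height of a tree: 0 for a leaf, otherwise 1 + the taller subtree."""
    if t == LEAF:
        return 0
    t1, t2 = t                    # an internal node is a pair of subtrees
    return 1 + max(height(t1), height(t2))

print(length_word("abba"))                    # 4
tree = ((LEAF, (LEAF, LEAF)), LEAF)           # a sample tree of height 3
print(height(tree))                           # 3
```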


Exercise 8.12 (Solution on page 443)

Give an inductive definition of the function sum(L) which computes the
sum of a list L of numbers. Use it to verify that sum([6, 2, 5]) = 13.


Exercise 8.13 (Solution on page 443)

The append function L₁ ++ L₂ joins two lists L₁ and L₂ together. For
example, [1, 2] ++ [3, 5, 7] = [1, 2, 3, 5, 7]. Give an inductive definition of the
append function.

Exercise 8.14 (Solution on page 443)

Referring  to  the  inductive  definition  for  formulae  of  predicate  logic  given  for 
Exercise  8.6  (page  208),  give  an  inductive  definition  for  a  function  which 
takes  a  formula  of  predicate  logic  and  returns  the  set  of  variables  which 
appear  free  in  that  formula. 


8.6 Recursive Functions


In each of the functions defined in the previous section, the value of the
function on a given argument is defined either directly, or in terms of its
values on smaller arguments. In particular, for functions defined over N the
value of the function on the argument 0 is defined directly, as there are no
natural numbers k < 0.

Such inductively-defined functions are examples of recursive functions,
which merely means that the value of a function applied to a given
argument is expressed in terms of the value of that function applied to
other - not necessarily smaller - arguments. Such definitions may not be
well-founded, though. For example, it would not make sense to define a
function by f(n) = f(n + 1) + 1; in this case, we'd be forever lost trying to
compute f(0) = f(1) + 1 = f(2) + 2 = f(3) + 3 = ···.


Example 8.14


McCarthy's 91-function f : N → N is defined as follows.

$$
f(n) =
\begin{cases}
n - 10, & \text{if } n > 100; \\
f(f(n + 11)), & \text{if } n \le 100.
\end{cases}
$$

This function is recursively defined, but not inductively defined. Because of
this, it is difficult even to see that this definition is well-founded - that is,
that it even defines a value for each argument. In actual fact, f(n) = 91 for
each n ≤ 100, and f(n) = n − 10 for each n > 100.
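
The claimed values can be spot-checked mechanically. The following is a minimal Python sketch, not part of the original text; the function name mccarthy91 is hypothetical. Running it on finitely many arguments is evidence rather than a proof.

```python
def mccarthy91(n: int) -> int:
    # First clause: f(n) = n - 10 if n > 100.
    if n > 100:
        return n - 10
    # Second clause: f(n) = f(f(n + 11)) if n <= 100.
    # The inner call is on a LARGER argument, so the definition is
    # recursive but not inductive.
    return mccarthy91(mccarthy91(n + 11))


# Spot check of the closed form on the arguments 0..200.
closed_form_holds = all(
    mccarthy91(n) == (91 if n <= 100 else n - 10) for n in range(201)
)
```

The nested calls stay shallow here because the argument climbs by 11 until it exceeds 100, so the default recursion limit is not an issue for these inputs.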


Exercise 8.15 (Solution on page 443)


Prove that McCarthy's 91-function does indeed satisfy f(n) = 91 for each
n ≤ 100, and f(n) = n − 10 for each n > 100.


Example 8.15


Consider the following function f : N → N.

$$
f(n) =
\begin{cases}
1, & \text{if } n \le 1; \\
f(n/2), & \text{if } n > 1 \text{ even}; \\
f(3n + 1), & \text{if } n > 1 \text{ odd}.
\end{cases}
$$

We can attempt to calculate the first few values of f:

f(0) = 1
f(1) = 1
f(2) = f(1) = 1
f(3) = f(10) = f(5) = f(16) = f(8) = f(4) = f(2) = f(1) = 1
f(4) = f(2) = f(1) = 1
f(5) = f(16) = f(8) = f(4) = f(2) = f(1) = 1
f(6) = f(3) = f(10) = f(5) = f(16) = f(8) = f(4) = f(2) = f(1) = 1
f(7) = f(22) = f(11) = f(34) = f(17) = f(52)
     = f(26) = f(13) = f(40) = f(20) = f(10)
     = f(5) = f(16) = f(8) = f(4) = f(2) = f(1) = 1
f(8) = f(4) = f(2) = f(1) = 1

We quickly realise that the value of the function must be 1 - if it has a value:
the only value it could have on some input n is

f(n) = ··· = f(1) = 1.

Indeed, this function seems to be well-defined: we don't seem to get into
any cycles like

f(n) = ··· = f(n);

and we always seem eventually to "bottom out" at f(1) = 1, although
the route to this is rather chaotic: it took 6 unrollings of the function
definition to compute f(5), 9 unrollings to compute f(6), and 17 unrollings
to compute f(7). It takes 11 unrollings to compute f(26) (as can be seen
in the calculation of f(7) above), but it takes no fewer than 112 unrollings
to compute f(27), including computing f(9232) along the way which itself
requires only 35 unrollings.

It is unknown whether or not this function is in fact well defined, that
is, whether the sequence of arguments visited when unrolling f(n) eventually
arrives at 1 for every n, although it has been confirmed for all numbers up to

n = 5.764 × 10^18 = 5,764,000,000,000,000,000.

The Collatz conjecture is the unproven claim that this sequence does
reach 1 regardless of the starting value n.
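
The unrolling counts quoted in this example can be reproduced with a few lines of code. The following is a minimal Python sketch, not part of the original text; the function name unrollings is hypothetical, and a loop is used in place of recursion so that long chains do not exhaust the call stack. Whether the loop terminates for every n is exactly the open question.

```python
def unrollings(n: int) -> int:
    """Count how many times the definition of f is applied to evaluate f(n),
    including the final application f(1) = 1 (or f(0) = 1)."""
    count = 1  # the application that finally returns 1
    while n > 1:
        # One unrolling: replace f(n) by f(n/2) or f(3n + 1).
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        count += 1
    return count


# The figures quoted in the text.
quoted = {5: 6, 6: 9, 7: 17, 26: 11, 27: 112, 9232: 35}
figures_agree = all(unrollings(n) == k for n, k in quoted.items())
```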
Recursive Procedures


As the data manipulated by computer programs is typically defined inductively,
it should come as no surprise that programs typically manipulate
this data recursively. That is, programs are written to run on some input
data by recursively calling themselves to run on (generally smaller) input -
unless the input data is so trivial that the program can immediately solve
the problem at hand.


Example 8.16 Insertion Sort

Consider the problem of sorting a list of integers into increasing order. One
method for doing this, called Insertion Sort, works as follows:

1. If the list has at most one element in it, then there is nothing to do:
the list is clearly already sorted.

2. Otherwise, put the top card to one side and sort the remaining cards.

3. Insert the reserved card into the correct position in the sorted list.

This  breaks  the  problem  of  sorting  a  list  of  numbers  down  to  that  of  sorting 
a  smaller  list.  But  the  trick  is  that  this  procedure  is  applied  recursively  in 
Step  2:  the  smaller  list  is  itself  sorted  by  the  same  procedure  of  putting  one 
card  to  the  side  and  (recursively)  sorting  the  remaining  cards  -  again  with 
the  above  procedure  -  before  inserting  the  reserved  card  into  the  resulting 
sorted  list. 

This  procedure  is  based  on  the  following  function  defined  inductively 
over  lists  of  numbers: 

isort [ ] = [ ]

isort (a : L) = (insert a) (isort L)

The definition of the auxiliary function (insert a), which inserts the number
a into a sorted list, is left as an exercise.
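
The two clauses of isort can be mirrored directly in a program. The following is a minimal Python sketch, not part of the original text. So as not to give away the exercise, the auxiliary function (insert a) is not defined inductively here; the standard-library call bisect.insort, which inserts a value into an already sorted list, is used as a stand-in for it.

```python
import bisect


def isort(xs: list) -> list:
    # isort [ ] = [ ]
    if not xs:
        return []
    # isort (a : L) = (insert a) (isort L)
    a, rest = xs[0], xs[1:]
    result = isort(rest)       # recursively sort the remaining elements
    bisect.insort(result, a)   # stand-in for (insert a) on a sorted list
    return result
```

For example, isort([3, 1, 2]) first sorts [1, 2] recursively and then inserts 3 into the result.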


Exercise 8.16 (Solution on page 444)

Define the function (insert a) inductively over (sorted) lists of numbers.
Your definition should look as follows:

(insert a) [ ] = ...

(insert a) (b : L) = ... (insert a) L


You can use the insertion sort procedure to sort a deck of 52 cards into
some fixed order, say Ace through King, with all of the Clubs first, followed
by the Diamonds, then Hearts, and finally the Spades. To sort the cards,
you put the top one down onto a table and sort the remaining 51 cards; to
do this, you put the top one down onto the table and sort the remaining 50
cards; continuing in this way, you will eventually find yourself with one card
in your hand and 51 cards on the table, which you pick up one-by-one and
insert into the correct place among the cards you are holding in your hand.

By doing this, the essence of recursion is hidden; the procedure could
simply start with all 52 cards on the table, and proceed by picking them up
and inserting them one-by-one into your hand, as many bridge players are
accustomed to doing. The following example, however, illustrates the power
of recursion in providing a sorting procedure which works much faster in
practice than insertion sort.


Example 8.17 Merge Sort

Another method for sorting a list of numbers, called Merge Sort, works as
follows:

1. If the list has at most one element in it, then there is nothing to do:
the list is clearly already sorted.

2. Otherwise, divide the list into two equal-sized lists (plus-or-minus one
number, if the list consists of an odd number of integers).

3. Sort each of the two shorter lists.

4. Merge the two sorted lists together to produce the desired sorted list.

This breaks the problem of sorting a list of numbers down to that of sorting
two smaller lists. But the trick is that this procedure is applied recursively
in Step 3: the two half-sized lists are each sorted by the same procedure of
dividing them into equal-sized lists and (recursively) sorting them - again
with the above procedure - before merging them together.
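
The four steps above can be sketched as a short program. The following is a minimal Python sketch, not part of the original text; the names msort and merge are hypothetical, and splitting the list at its midpoint is just one way of producing two lists whose sizes differ by at most one.

```python
def merge(left: list, right: list) -> list:
    """Step 4: merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Repeatedly take the smaller of the two front elements.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted; the remainder of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def msort(xs: list) -> list:
    # Step 1: a list with at most one element is already sorted.
    if len(xs) <= 1:
        return list(xs)
    # Step 2: divide the list into two halves (sizes differ by at most one).
    mid = len(xs) // 2
    # Step 3: sort each half recursively; Step 4: merge the results.
    return merge(msort(xs[:mid]), msort(xs[mid:]))
```

Both recursive calls are on strictly shorter lists, which is what guarantees that the procedure eventually reaches Step 1.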


This  procedure  can  be  elegantly  demonstrated  by  having  a  group  of 
people  sort  a  deck  of  cards.  Everyone  in  the  group  is  to  carry  out  the 
following  procedure  if  they  are  handed  a  pile  of  cards: 

1. If there is only one card in the pile that you are handed, then hand
the pile right back to the person who gave it to you.

2.  Otherwise,  split  the  pile  into  two  equal-sized  piles  and  pass  these 
smaller  piles  to  two  other  people  who  are  not  holding  any  cards. 

3.  Take  each  of  the  two  piles  back  when  they  are  handed  back  to  you. 
You  will  discover  that  -  as  if  by  magic  -  these  two  piles  are  each  sorted. 

4.  Merge  these  two  sorted  piles  into  one  sorted  pile,  and  hand  this  sorted 
pile  back  to  the  person  who  gave  it  to  you. 
Exercise 8.17 (Solution on page 444)

Figure 8.1 depicts the puzzle of the Towers of Hanoi in which we have
three pegs and a number of discs of varying diameter; each disc has a hole in
its centre allowing it to be positioned on the pegs. Starting with all of the
discs on the first peg in increasing size with the largest on the bottom and
the smallest on the top, the puzzle is to move all of the discs to a different
peg by moving discs one at a time from peg to peg without ever placing any
disc on top of a smaller disc.

Describe a recursive procedure for solving this puzzle. How many individual
disc moves would your procedure take on the five-disc puzzle in
Figure 8.1?


Additional  Exercises 


1.  Consider  the  two  quotes  given  at  the  start  of  this  chapter  and  the  next 
chapter.  Only  one  of  these  properly  underlies  the  principle  of  inductive 
definitions.  Why  is  this?  (Hint:  Consider  what  the  base  case  may  be 
in  each  quote.) 

2. Consider the set X ⊆ N defined as follows.

(a) 0 ∈ X.

(b) if n ∈ X then (n+3) ∈ X and (n+7) ∈ X.

(c) Nothing is in X unless its membership can be established from
the above.


Give  three  elements  of  N  which  are  elements  of  X,  and  three  elements 
of  N  which  are  not  elements  of  X,  explaining  for  each  one  why  it  is  or 
is  not  an  element. 

Can you give a complete description of the set X?

3. Describe the set P defined as the smallest set satisfying the following:

(a) {ε, 0, 1} ⊆ P.

(b) if w ∈ P then {0w0, 1w1} ⊆ P.

Give  three  elements  of  {0, 1}*  which  are  elements  of  P,  and  three 
elements  of  {0, 1}*  which  are  not  elements  of  P,  explaining  for  each 
one  why  it  is  or  is  not  an  element. 

4. Give an inductive definition of the function nodecount(t) which computes
the number of internal nodes in the binary tree t, where the
definition of a binary tree is as given in Example 8.7. Use this function
to verify that

nodecount(N(N(*, *), N(N(*, N(*, *)), *))) = 5.

5.  Give  a  BNF  equation  for  (a  fragment  of)  your  favourite  programming 
language. 

6. Give an inductive definition of the function listnames(d) which takes
a  dictionary  of  names  d,  as  defined  by  Exercise  8.7,  and  produces  a 
list  of  names  in  alphabetic  order  (assuming  the  names  are  properly 
arranged  alphabetically  in  the  dictionary). 

7.  Give  an  inductive  definition  for  a  function  which  takes  a  formula  of 
predicate  logic  and  returns  an  equivalent  formula  in  which  negation 
symbols  appear  only  applied  to  predicates. 

8.  Give  an  inductive  definition  of  the  function  rev  which  takes  a  list  and 
returns  its  reverse.  Thus,  for  example,  rev(  [1,  2,  4] )  =  [4,  2, 1]. 

Use  your  definition  to  compute  rev(  [1,2,4]). 

9.  Male  bees  hatch  from  unfertilised  eggs,  and  so  have  a  mother  but  no 
father.  Female  bees  hatch  from  fertilised  eggs,  and  so  have  both  a 
mother  and  a  father.  The  family  tree  of  a  male  bee  can  be  seen  in 
Figure 8.2. How many ancestors does a male bee have in the tenth
generation back? How many of these ancestors are male?

10.  Give  an  inductive  definition  of  the  function  msort  upon  which  merge 
sort  is  based.  You  will  want  to  define  auxiliary  functions  split  which 
splits  a  list  into  two  equal-size  lists,  and  merge  which  merges  two 
sorted  lists  into  one  list. 

Figure 8.2: A family tree of male (♂) and female (♀) bees.

11. Ackermann's Function is defined inductively as follows. For n ≥ 0,

A(0, n) = n + 1;

and for m, n ≥ 1,

A(m, 0) = A(m−1, 1) and
A(m, n) = A(m−1, A(m, n−1)).

This is an extremely fast growing function. For example, the value of
A(4, 2) has 19,729 decimal digits; and the value of A(4, 3) is already
well beyond astronomical.

(a) Work out the first few values of A(1, n) to convince yourself that
A(1, n) = n+2.

(b) Work out the first few values of A(2, n) to convince yourself that
A(2, n) = 2n+3.

(c) Work out the first few values of A(3, n) to convince yourself that
$A(3, n) = 2^{n+3} - 3$.

(d) Work out the value of A(4, 1).
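
Working out the first few values by hand can be cross-checked by transcribing the three clauses of the definition. The following is a minimal Python sketch, not part of the original text; the function name ackermann is hypothetical. It is only suitable for very small arguments: the recursion becomes extremely deep very quickly, so it is not a practical way to approach part (d).

```python
def ackermann(m: int, n: int) -> int:
    # A(0, n) = n + 1
    if m == 0:
        return n + 1
    # A(m, 0) = A(m-1, 1)   (for m >= 1)
    if n == 0:
        return ackermann(m - 1, 1)
    # A(m, n) = A(m-1, A(m, n-1))   (for m, n >= 1)
    return ackermann(m - 1, ackermann(m, n - 1))


# Small values only, for comparison with the patterns in parts (a)-(c).
row1 = [ackermann(1, n) for n in range(6)]
row2 = [ackermann(2, n) for n in range(6)]
row3 = [ackermann(3, n) for n in range(4)]
```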


Chapter  9 

Proofs  by  Induction 


In  the  middle  of  a  cloudy  thing  is  another  cloudy  thing,  and  within 
that  another  cloudy  thing,  inside  which  is  yet  another  cloudy  thing... 

...  and  in  that  is  yet  another  cloudy  thing,  inside  which  is  something 
perfectly  clear  and  definite. 

-  Ancient  Sufi  saying. 

One of the most common forms of reasoning used within the subject of Computer
Science is inductive reasoning. This is due to the fact, explored in
the previous chapter, that Computer Science deals heavily with manipulating
inductively-defined objects. Reasoning about such objects will naturally
rely on exploiting the inductive nature of their definitions.

In Section 5.6 we explored the general technique for proving a property
of the form ∀x P(x), namely, to allow x to stand for an arbitrary value of the
domain and to prove that P(x) holds without making any assumptions about
the value of x. Such a general approach is typically too weak to prove facts
about natural numbers; we would like to be able to exploit the inductively-
defined structure of natural numbers to arrive at our result. Such is the role
of induction proofs.


Convincing  but  Inconclusive  Evidence 


Consider the following claim that the sum of the first n positive integers is
$\frac{n(n+1)}{2}$:

Claim: For all $n \ge 0$, $1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$.

Note that the sum of the first zero natural numbers, which above is awkwardly
written as 1 + 2 + 3 + ··· + 0, is naturally 0.

We  can  easily  confirm  this  claim  for  various  values  of  n: 


F.  Moller,  G.  Struth,  Modelling  Computing  Systems, 

Undergraduate  Topics  in  Computer  Science, 

DOI  10. 1007/978- 1-84800-322-4_10,  ©  Springer- Verlag  London  2013 
$$
\begin{array}{rclcll}
 & & 0 & = & \frac{0 \times 1}{2} & \text{so the claim is true when } n = 0. \\
1 & = & 1 & = & \frac{1 \times 2}{2} & \text{so the claim is true when } n = 1. \\
1 + 2 & = & 3 & = & \frac{2 \times 3}{2} & \text{so the claim is true when } n = 2. \\
1 + 2 + 3 & = & 6 & = & \frac{3 \times 4}{2} & \text{so the claim is true when } n = 3. \\
1 + 2 + 3 + 4 & = & 10 & = & \frac{4 \times 5}{2} & \text{so the claim is true when } n = 4. \\
1 + 2 + 3 + 4 + 5 & = & 15 & = & \frac{5 \times 6}{2} & \text{so the claim is true when } n = 5.
\end{array}
$$

Each instance of the claim which we verify to be true seems to lend support
to the validity of the claim. However, no (finite) amount of checking of
individual cases can confirm the validity of the claim for all values of n.
Now consider each of the following claims.


•  Fermat's Last Theorem claims that for no integer n > 2 does there
exist a trio of positive integers x, y and z such that $x^n + y^n = z^n$. This
claim went unproven for 350 years until Andrew Wiles' celebrated
proof in the 1990s. By then, the conjecture had been confirmed with the
help of vast computer resources for all values of n up to 4 million.
However, even if computers could have confirmed the truth of this
conjecture for all values of n up to ten zillion, there would still be no
reason why the conjecture should be true for ten zillion and one.
Pierre de Fermat, after whom Fermat's Last Theorem is named, famously
wrote the following about this Theorem in the margin of a textbook
on arithmetic: “Cuius rei demonstrationem mirabilem sane
detexi. Hanc marginis exiguitas non caperet.” (“I have a truly marvellous
proof of this proposition which this margin is too narrow to
contain.”) It is universally believed that whatever argument he may
have had in mind could not have been valid. This is partly due to
the fact that no proof was ever found amongst his papers, and partly
due to the extreme complexity of the only known proof by Wiles -
which can be understood in its entirety by only a small number of
mathematicians worldwide. It is also partly due to the fact that Fermat
believed many things which ultimately turned out to be false, such as
the next example.

•  Fermat numbers are integers of the form $F_n = 2^{2^n} + 1$. They are so
called on account of the fact that Pierre de Fermat wrote, in a letter
to Marin Mersenne on 25 December 1640, that: “If I can determine
the basic reason why

3, 5, 17, 257, 65,537, ...

are prime numbers, I feel that I would find very interesting results.”
Based on the properties of the first few numbers of this form,
Fermat believed that they were all necessarily prime. Indeed the first
few Fermat numbers listed by Fermat are prime:

$$
\begin{array}{lclclcl}
F_0 & = & 2^{2^0} + 1 & = & 2^1 + 1 & = & 3 \\
F_1 & = & 2^{2^1} + 1 & = & 2^2 + 1 & = & 5 \\
F_2 & = & 2^{2^2} + 1 & = & 2^4 + 1 & = & 17 \\
F_3 & = & 2^{2^3} + 1 & = & 2^8 + 1 & = & 257 \\
F_4 & = & 2^{2^4} + 1 & = & 2^{16} + 1 & = & 65{,}537
\end{array}
$$

Unfortunately for Fermat, his conjecture fails with the very next Fermat
number:

$$F_5 = 2^{2^5} + 1 = 2^{32} + 1 = 4{,}294{,}967{,}297.$$

Fermat can be forgiven for not recognising this monstrosity to be a
composite number. It was the great mathematician Leonhard Euler
who first discovered in 1732 that this number can be factored as

641 × 6,700,417.

Indeed, it is unknown whether any further Fermat numbers are prime
(though it is known that a great many are not).

•  Goldbach's conjecture, which states that every even number greater
than 2 can be expressed as the sum of two prime numbers, has been
confirmed, again with the help of vast computer resources, for all even
numbers up to $10^{18}$ (i.e., 1,000,000,000,000,000,000). But as far as
anyone knows, there might be a yet larger even number which is not
the sum of two primes. It worked out well for Fermat's Last Theorem,
but this gives no reason for hope, as demonstrated by the next two
examples.

•  In 1919, the Hungarian mathematician George Pólya conjectured that
most (i.e., more than 50%) of the natural numbers less than any given
number have an odd number of prime factors. For example, every
prime number has an odd number of prime factors, namely one, as
does 12 = 2×2×3 (three prime factors), while 14 = 2×7 has an even
number (two) of prime factors. By the mid-1950s empirical evidence
for Pólya's conjecture seemed clear: the conjecture was verified for all
numbers up to 800,000. However, contrary to this ever-growing evidence,
Pólya's conjecture was disproved in 1958 when C. Brian Haselgrove
showed that it had to be false for some value around $2 \times 10^{361}$
(that is, a 2 followed by 361 zeros). It has since been shown to fail
already for n = 906,150,257.


•  Consider the following claim:

For all $n \ge 1$, $991n^2 + 1$ is not a perfect square;

that is, $\sqrt{991n^2 + 1}$ is not an integer.

We could confirm the validity of this claim for as many values of n
as we have patience, but we could never conclude on the basis of the
validity of a large number of cases that the claim is valid for all values
of n. The claim is in fact false; however, the first value of n for which
the claim fails is

n = 12,055,735,790,331,359,447,442,538,767.
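
Two of the claims above are small enough to check mechanically. The following is a minimal Python sketch, not part of the original text; all function and variable names are hypothetical, and primality is tested by naive trial division, which is adequate only for numbers as small as F_4. The sketch confirms that the stated n is a counterexample to the last claim and that no n below 100,000 is; it does not confirm that the stated n is the first counterexample.

```python
import math


def is_perfect_square(m: int) -> bool:
    # math.isqrt returns the exact integer square root, even for huge m.
    r = math.isqrt(m)
    return r * r == m


def fermat(n: int) -> int:
    # F_n = 2^(2^n) + 1
    return 2 ** (2 ** n) + 1


def is_prime(m: int) -> bool:
    # Naive trial division; fine for F_0, ..., F_4 but not beyond.
    if m < 2:
        return False
    return all(m % d != 0 for d in range(2, math.isqrt(m) + 1))


# F_0, ..., F_4 are prime, but F_5 factors as Euler found.
first_five_prime = all(is_prime(fermat(n)) for n in range(5))
f5_is_composite = fermat(5) == 641 * 6_700_417

# The claim about 991 n^2 + 1 survives a lot of checking ...
no_small_counterexample = not any(
    is_perfect_square(991 * n * n + 1) for n in range(1, 100_000)
)
# ... and still fails at the value of n quoted in the text.
N = 12_055_735_790_331_359_447_442_538_767
fails_at_N = is_perfect_square(991 * N * N + 1)
```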

We cannot be content with the mere experience of witnessing various instances
of when a claim is true to lend reckless support to its universal
truth. We cannot confidently lend any credence to Collatz's conjecture of
Example 8.15 despite the comfort offered by the knowledge it holds for all
values up to $5.764 \times 10^{18}$. Similarly, and more worryingly, a train may run
perfectly for arbitrarily long - several years even - before a fault in its software
control system contributes to a devastating crash.


Exercise 9.1 (Solution on page 444)

Some number of spots are placed randomly around the circumference of a
circle, and every spot is connected to every other spot by a straight line.
Assuming that no three lines intersect at a point inside the circle, we would
like to know into how many regions the circle is divided.

For  example,  given  1,  2,  3,  4,  or  5  spots,  the  circle  is  divided  into  1, 
2,  4,  8,  or  16  regions,  respectively;  the  final  three  of  these  are  depicted  in 
Figure  9.1. 

How  many  regions  are  created  by  connecting  six  spots? 
A Primary School Induction Argument


Suppose you wish to check that the formula

$$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$$

is true for the first 30 values of n, and you ask a classroom of 30 ten-year-
olds to check this formula, each child checking if it is true for some value of
n. For example, the 17th child will check that

$$1 + 2 + 3 + \cdots + 17 = \frac{17 \times 18}{2}.$$

You watch each child working diligently on their individual problems and,
as expected, the first few children, working on confirming the formula for
small values of n, are quick to report their success. Those working on larger
values of n are taking longer. For example, it is taking a long while for the
17th child to add up the first 17 numbers to find they add up to 153, and
then to compute $\frac{17 \times 18}{2} = 153$ to discover that the claim is true for n = 17.

Some  children  are  reporting  failure  before  checking  their  work  and  finding 
errors  in  their  calculations  before  ultimately  reporting  success. 

Alone  in  the  crowd  is  the  28th  child,  a  little  girl  who  is  sitting  quietly 
reading  a  novel  instead  of  working  away  on  her  calculations.  You  ask  her  if 
she  is  done,  and  she  says  yes.  You  ask  her  if  the  formula  is  true  for  n=28 
and  she  says  she  doesn’t  know  -  yet.  Confused,  you  look  at  her  sheet  of 
paper  and  see  the  following  calculation: 

$$
\begin{array}{rcl}
1 + 2 + 3 + \cdots + 28 & = & (1 + 2 + 3 + \cdots + 27) + 28 \\
 & = & \frac{27 \times 28}{2} + 28 \\
 & = & 28 \times \left(\frac{27}{2} + 1\right) \\
 & = & \frac{28 \times 29}{2}
\end{array}
$$

As you look over this calculation, the boy at the next desk announces that
he has finished adding up the first 27 numbers and that they add up to
$378 = \frac{27 \times 28}{2}$ as expected: the formula is true for n = 27. The little girl
immediately responds to this by announcing that the formula is true for
n = 28.

What this precocious little girl realised was that she could leave most of
the hard work of adding up the first 28 numbers to her friend beside her,
the little boy who is busily adding up the first 27 numbers. Once he has
done that, all she needs to do is add 28 to his total. Knowing what the first
27 numbers are supposed to add up to, namely $\frac{27 \times 28}{2}$, she doesn't wait for
him to do his job, but rather goes to work under the assumption that her
friend will confirm this expectation. This is the calculation that she carried
out.

Having carried out this calculation, can she say that the first 28 numbers
add up to $\frac{28 \times 29}{2}$? Not right away, as she made the assumption that the
first 27 numbers add up to $\frac{27 \times 28}{2}$. Once her friend, the 27th child, confirms
this assumption, she can (and does) announce boldly that the formula is
true for n = 28.

There  is  nothing  special  about  the  number  28,  just  something  special 
about  this  little  girl.  If  she  had  the  problem  of  checking  the  formula  for 
any  other  number,  she  would  have  done  the  same  thing.  She  was  no  doubt 
quietly  wondering  why  her  friend  beside  her  was  busily  adding  up  all  the 
first  27  numbers;  and  indeed  why  her  other  friend  on  her  other  side  was 
busily  adding  up  the  first  29  numbers. 

Exercise 9.2 (Solution on page 445)

What calculation would this little girl do if she were the 27th child?


Exercise 9.3 (Solution on page 445)

When  he  was  ten  years  old,  the  great  mathematician  Carl  Friedrich  Gauss 
was  reportedly  set  the  problem  of  adding  up  the  first  100  numbers.  His 
teacher’s  intention  was  to  keep  the  class  busy  and  quiet  for  some  time,  but 
Gauss  solved  the  problem  almost  immediately.  What  clever  trick  did  young 
Gauss  employ? 


The  Induction  Argument 


Just as we can inductively define functions over inductively-defined domains,
we can exploit the structure of an inductive definition to reason about the
objects it defines. For example, mathematical induction allows you to
prove that a property P(n) of natural numbers n ∈ N holds for all natural
numbers if:


1. (Base Case) it holds for the value 0, that is, P(0); and

2. (Induction Step) it holds for the value k+1 whenever it holds for k;
that is,

P(k) ⇒ P(k+1).

P(k) is referred to as the inductive hypothesis, from which we want
to deduce P(k+1).

Clause 2 can be equally expressed as follows:

2'. it holds for the value k > 0 whenever it holds for k−1; that is,

P(k−1) ⇒ P(k).

The little girl discussed above did precisely this type of reasoning in showing
that the property P(n), which states that the first n numbers add up to
$\frac{n(n+1)}{2}$, holds for the value 28 assuming it holds for the value 27.

As an analogy, imagine a (possibly infinite) string of dominoes standing
side-by-side as in Figure 9.2. If we can prove that the first domino falls
(i.e., gets pushed over), and that if one domino falls, the next domino will
fall (i.e., gets pushed over by the preceding domino), then this is enough to
conclude that all of the dominoes will fall over.

We can think of induction as a method of extending our knowledge of
the truth: we establish the claim for the first relevant value (typically 0).
Next we show that if the claim is true for some value k then it must also be
true for the next value k+1. The important thing to note here is that k is
not given a specific value although it might have some conditions imposed
on it (in this case k ≥ 0). Now since we know the claim to be true for 0, it
must also be true for 1; but then it must also be true for 2; but then it must
also be true for 3; and continuing in this fashion, we realise that the claim
must be true for any value n ∈ N. In this way we are viewing induction
proofs as a form of bootstrapping argument, as depicted in Figure 9.3.

Alternatively, we can think of induction as a proof by contradiction: if
the claim is false - that is, if the property does not hold for all values of
n ∈ N - then it must fail for some smallest value n ≥ 0; that is, the claim
holds for all values less than n but not for n itself. The question then is:
what can n be? It cannot be 0, as the base case established that the claim
holds for n = 0. But then by the induction step, n cannot be 1 either; and
hence not 2 either; and hence not 3 either; and hence not 4 either. We can
carry on this reasoning indefinitely to show that n cannot be any value; for
example, n cannot be 1,594 since, being the smallest value for which the
claim is false, the claim would be true for 1,593, and thus by the induction
step it must also be true for 1,594. Continuing in this fashion, we realise
our contradiction: the claim cannot actually fail for any value n ∈ N.

Following  this  extensive  discussion,  we  can  finally  offer  the  first  formal 
proof  by  induction,  as  a  model  on  which  to  base  all  other  induction  proofs. 


Example


Fact: For all $n \ge 0$,

$$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}.$$

Proof: By induction on n.

Base Case: We note that

$$1 + 2 + 3 + \cdots + 0 = 0 = \frac{0(0+1)}{2}.$$

Induction Step: We assume that, for some k,

$$1 + 2 + 3 + \cdots + k = \frac{k(k+1)}{2},$$

and from this assumption (the inductive hypothesis) we prove that

$$1 + 2 + 3 + \cdots + k + (k+1) = \frac{(k+1)(k+2)}{2}.$$

That is, we demonstrate that if the statement of the theorem is true
when n = k, then it must also be true when n = k+1.

By the inductive hypothesis we can rewrite the left-hand side of this
equation that we want to prove true as

$$\frac{k(k+1)}{2} + (k+1).$$

We can then take out the common factor (k+1) from these two terms,
giving us

$$(k+1)\left(\frac{k}{2} + 1\right),$$

which is the same as

$$(k+1)\left(\frac{k+2}{2}\right),$$

or in other words,

$$\frac{(k+1)(k+2)}{2},$$

which is the right-hand side that we desire.

In other words, we carried out the following equational derivation:

$$
\begin{array}{rcll}
1 + 2 + 3 + \cdots + k + (k+1) & = & \frac{k(k+1)}{2} + (k+1) & \text{(by the inductive hypothesis)} \\
 & = & (k+1)\left(\frac{k}{2} + 1\right) & \\
 & = & (k+1)\left(\frac{k+2}{2}\right) & \\
 & = & \frac{(k+1)(k+2)}{2}. &
\end{array}
$$
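
The shape of this proof can be mirrored by a program that carries a running total from one value of n to the next. The following is a minimal Python sketch, not part of the original text; the function names are hypothetical. It checks only finitely many cases and so, as discussed earlier in this chapter, it is a sanity check on the algebra rather than a substitute for the proof.

```python
def formula(n: int) -> int:
    # The closed form n(n+1)/2; the product n(n+1) is always even.
    return n * (n + 1) // 2


def check_up_to(limit: int) -> bool:
    """Check the formula for n = 0, 1, ..., limit, one step at a time."""
    total = 0  # Base case: the empty sum 1 + 2 + ... + 0 is 0.
    if total != formula(0):
        return False
    for k in range(limit):
        # At this point total == formula(k), playing the role of the
        # inductive hypothesis. The step adds the single new term (k+1).
        total += k + 1
        if total != formula(k + 1):
            return False
    return True
```

Like the little girl in the classroom, each pass through the loop does no more work than adding one number to a total that is already known.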


At  this  point  you  should  reflect  on  what  the  little  girl  in  the  Primary 
School  problem  from  Section  9.2  did,  and  relate  it  to  the  induction  step  of 
the  above  argument.  If  her  reasoning  is  clear,  the  following  formulae  should 
be  straightforward  to  verify. 


Exercise 9.4 (Solution on page 446)

Show, by induction on $n$, that the following formulae are true for all $n \ge 0$.

1. $1^2 + 2^2 + 3^2 + \cdots + n^2 = \dfrac{n(n+1)(2n+1)}{6}$.

2. $1 + 3 + 5 + \cdots + (2n-1) = n^2$.

3. $1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n+1) = \dfrac{n(n+1)(n+2)}{3}$.

Exercise 9.5 (Solution on page 447)

Show, by induction on $n$, that for all $n \ge 0$:

$$F_0 \times F_1 \times \cdots \times F_n = F_{n+1} - 2$$

where $F_n = 2^{2^n} + 1$ are the Fermat numbers.


Induction is a very common technique for establishing mathematical
formulae such as the following.


Exercise 9.6 (Solution on page 448)

Show, by induction on $n$, that for any real number $r \ne 1$,

$$1 + r + r^2 + r^3 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}$$

for all $n \ge 0$.


Note that if $-1 < r < 1$ then $r^{n+1}$ approaches 0 as $n$ approaches
infinity; hence, as a corollary to the above, we can deduce that for any real
$r$ with $|r| < 1$,

$$1 + r + r^2 + r^3 + \cdots = \frac{1}{1 - r}.$$


So  far  we  have  used  induction  merely  to  prove  simple  formulae.  However, 
induction  is  more  general  than  this,  and  the  base  case  can  be  some  value  or 
values  other  than  0,  as  the  next  examples  demonstrate. 

Example 9.6

Fact: Any amount of postage of at least 8 pence can be made up from
just 3-pence and 5-pence stamps.

Proof: By induction on $n$, the amount of postage in pence.

Base Case: A 3-pence stamp and a 5-pence stamp make up 8 pence.

Induction Step: Assume that we have a collection of such stamps
adding up to a total of $n \ge 8$ pence.

• If there is a 5-pence stamp in this collection, remove it and
replace it with two 3-pence stamps.

• If there are no 5-pence stamps, then there must be (at least)
three 3-pence stamps in the collection; remove these and
replace them with two 5-pence stamps.

In either case, we arrive at a collection of stamps adding up to
$(n+1)$ pence. □
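The induction step of this proof is constructive: it tells us how to turn a collection worth $n$ pence into one worth $n+1$ pence. As a rough illustration, the sketch below starts from the base case and applies that step repeatedly. The function name `stamps` and the representation of a collection as a pair of counts are assumptions made for this sketch only.

```python
def stamps(n: int) -> tuple[int, int]:
    """Return (threes, fives) with 3*threes + 5*fives == n, for n >= 8."""
    if n < 8:
        raise ValueError("the Fact only covers amounts of at least 8 pence")
    threes, fives = 1, 1          # base case: 3 + 5 = 8
    for _ in range(8, n):         # apply the induction step n - 8 times
        if fives > 0:
            fives -= 1            # swap one 5 for two 3s: total rises by 1
            threes += 2
        else:
            threes -= 3           # swap three 3s for two 5s: total rises by 1
            fives += 2
    return threes, fives


for amount in range(8, 100):
    t, f = stamps(amount)
    assert t >= 0 and f >= 0 and 3 * t + 5 * f == amount
```

The `else` branch is safe for the reason given in the proof: a collection worth at least 8 pence with no 5-pence stamps must contain at least three 3-pence stamps.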


Fact: The sum of the interior angles of a convex polygon with $n$ sides is
equal to $(n-2)180^\circ$ for all $n \ge 3$. (A polygon is convex if every line
joining two points of the polygon lies within the polygon.)


Proof: By induction on $n$.

Base Case: The sum of the interior angles of any triangle is $180^\circ$.

Induction Step: We assume that the theorem is true for some value
$k \ge 3$: that the sum of the interior angles of any convex polygon
with $k$ sides is equal to $(k-2)180^\circ$.

From this inductive hypothesis, we demonstrate that it must also
be true for $k+1$: that the sum of the interior angles of any convex
polygon with $k+1$ sides is equal to $(k-1)180^\circ$.

Any convex $(k+1)$-gon can be decomposed into a triangle
and a convex $k$-gon by connecting two non-adjacent vertices
which share a common neighbouring vertex.

The sum of the interior angles of this $(k+1)$-gon is then the sum
of the interior angles of the triangle, $180^\circ$, added to the sum of
the interior angles of the $k$-gon which, by induction, is $(k-2)180^\circ$.
In total this gives

$$180^\circ + (k-2)180^\circ = (k-1)180^\circ.$$

□


Exercise 9.7 (Solution on page 449)


Suppose we draw $n$ circles ($n \ge 1$) so that any two intersect at two points but
no three intersect at any point. Prove, by induction on $n$, that these circles
divide the plane into $n^2 - n + 2$ regions. Deduce from this that we cannot
draw a Venn diagram for four or more sets with circles representing sets.


Induction is of immense importance in Computer Science where a great
many of the objects under study are inductively defined. It is imperative
that a Computer Scientist be comfortable with inductive reasoning in order
to be successful with designing and understanding computing systems.
The following provides an example of reasoning inductively about a simple
program.


Exercise 9.8 (Solution on page 449)

Consider the following piece of recursive program code:


    function f(n)
        if n = 0 then return 0
        else return f(n-1) + 2n - 1


This program code computes the following inductively-defined function:

$$f(n) = \begin{cases} 0, & \text{if } n = 0 \\ f(n-1) + 2n - 1, & \text{if } n > 0. \end{cases}$$

Show, by induction on $n$, that $f(n) = n^2$ for all $n \ge 0$.
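Before attempting the proof, it may help to run the program and observe the claimed behaviour on small inputs. The following is a direct Python transcription of the recursive code above, offered as a sketch; the range of values tested is an arbitrary choice, and passing the test is of course not a proof.

```python
def f(n: int) -> int:
    """f(0) = 0 and f(n) = f(n-1) + 2n - 1 for n > 0."""
    if n == 0:
        return 0
    return f(n - 1) + 2 * n - 1


# Check the claim f(n) = n^2 on an initial segment of the natural numbers.
for n in range(0, 300):
    assert f(n) == n * n
```

Each recursive call adds the next odd number $2n-1$, which is exactly the difference $n^2 - (n-1)^2$; this observation is the heart of the induction step.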


9.4 Strong Induction

In a proof by induction we demonstrate that some property holds of some
number based on the assumption that the property holds of the previous
number. Occasionally we may want to assume that the property holds of
other smaller numbers, not just the previous number. An alternative form
of induction which permits this is strong induction, which allows you to
prove that a property $P(n)$ of natural numbers holds of all natural numbers
by demonstrating the following:

• $P(n)$ holds for $n$ whenever it holds for all $k < n$; that is,

$$(\forall k < n : P(k)) \Rightarrow P(n).$$

You may well wonder at this point: what happened to the base case? In the
case of $n = 0$, the assumption that $P(k)$ holds for all values $k < n$ is vacuous,
since there are no such values, and hence this one clause incorporates the
base case of demonstrating that $P(0)$ holds under no assumption.


Let

$$f(n) = \begin{cases} 0, & \text{if } n = 0 \\ 2 \cdot f(n/2), & \text{if } n > 0 \text{ is even} \\ f(n-1) + 1, & \text{if } n \text{ is odd.} \end{cases}$$

Fact: $f(n) = n$ for every $n \ge 0$.

Proof: By (strong) induction on $n$, arguing by cases on the “structure” of
$n$.


$n = 0$: $f(0) = 0$.

$n > 0$ even: $f(n) = 2 \cdot f(n/2) = 2 \cdot (n/2) = n$. (By induction)

$n$ odd: $f(n) = f(n-1) + 1 = (n-1) + 1 = n$. (By induction)


□
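The three cases of this proof read off directly as a recursive program, which makes it easy to see why ordinary induction is not enough here: the even case appeals to the value at $n/2$, not at $n-1$. The sketch below assumes the definition of $f$ is exactly the one used in the three proof cases (zero, positive even, odd).

```python
def f(n: int) -> int:
    """f(0) = 0; f(n) = 2*f(n/2) for even n > 0; f(n) = f(n-1) + 1 for odd n."""
    if n == 0:
        return 0
    if n % 2 == 0:
        return 2 * f(n // 2)   # appeals to the much smaller value n/2
    return f(n - 1) + 1        # appeals to the previous value n-1


# Check the claim f(n) = n on an initial segment of the natural numbers.
assert all(f(n) == n for n in range(0, 2000))
```

Every recursive call is made on a strictly smaller argument, which is what licenses the use of the (strong) inductive hypothesis in each case.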


Exercise 9.9 (Solution on page 450)

Prove, by strong induction, that every integer $n > 1$ is either prime or a
product of primes.

This result, attributed first to Euclid over 2000 years ago, is referred to
as the Fundamental Theorem of Arithmetic.

We showed earlier how to define functions inductively, e.g., the Harmonic
numbers $H_n$ (Example 8.9) and the Fibonacci numbers (Example 8.10).
Induction proofs are naturally used to reason about such inductively-defined
functions, as evidenced by the following examples.


Example


Fact: For all $n \ge 0$,

$$H_1 + H_2 + H_3 + \cdots + H_n = (n+1)H_n - n.$$

Proof: By induction on $n$.

Base Case ($n = 0$):

$$H_1 + H_2 + H_3 + \cdots + H_0 = 0 = (0+1)H_0 - 0.$$

Induction Step ($n > 0$):

$$\begin{aligned}
H_1 + H_2 + H_3 + \cdots + H_n
  &= (H_1 + H_2 + H_3 + \cdots + H_{n-1}) + H_n \\
  &= nH_{n-1} - (n-1) + H_n && \text{(by inductive hypothesis)} \\
  &= n\left(H_n - \tfrac{1}{n}\right) - (n-1) + H_n && \text{(since } H_n = H_{n-1} + \tfrac{1}{n}\text{)} \\
  &= (n+1)H_n - n. \qquad \Box
\end{aligned}$$
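Identities involving fractions are easy to get wrong by a small term, so an exact numerical check can be reassuring. The sketch below uses Python's `fractions.Fraction` for exact rational arithmetic and assumes the usual definition $H_0 = 0$, $H_n = H_{n-1} + \frac{1}{n}$; the function name `harmonic` is our own.

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """H_n = 1/1 + 1/2 + ... + 1/n, with H_0 = 0, as an exact fraction."""
    h = Fraction(0)
    for k in range(1, n + 1):
        h += Fraction(1, k)
    return h


for n in range(0, 60):
    # Left-hand side: H_1 + H_2 + ... + H_n (the empty sum when n = 0).
    lhs = sum((harmonic(i) for i in range(1, n + 1)), Fraction(0))
    assert lhs == (n + 1) * harmonic(n) - n
```

Using exact fractions rather than floating-point numbers means that the equality test is genuine and not subject to rounding error.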


Exercise 9.10 (Solution on page 450)


Prove that for all $m \ge 1$ and all $n \ge m$,

$$H_n - H_m \ge \frac{n-m}{n}.$$

Do this by assuming $m \ge 1$ and proving the result by induction on $n$.


Fact: $f_0 + f_1 + f_2 + \cdots + f_n = f_{n+2} - 1$ for all $n \ge 0$.

Proof: By induction on $n$.

Base Case ($n = 0$):

$$f_0 + f_1 + f_2 + \cdots + f_0 = f_0 = 0 = 1 - 1 = f_2 - 1.$$

Induction Step ($n \ge 0$):

$$\begin{aligned}
f_0 + f_1 + f_2 + \cdots + f_n + f_{n+1}
  &= (f_{n+2} - 1) + f_{n+1} && \text{(by the inductive hypothesis)} \\
  &= (f_{n+1} + f_{n+2}) - 1 = f_{n+3} - 1. \qquad \Box
\end{aligned}$$
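The same identity can be observed numerically by keeping a running total of the Fibonacci numbers. The sketch below assumes the convention $f_0 = 0$, $f_1 = 1$ used in the base case above; the function name `fib` is our own.

```python
def fib(n: int) -> int:
    """The n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


running_total = 0
for n in range(0, 100):
    running_total += fib(n)                  # f_0 + f_1 + ... + f_n
    assert running_total == fib(n + 2) - 1   # the Fact, for this n
```

The loop body is the induction step in miniature: the total for $n$ is obtained from the total for $n-1$ by adding a single term.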


We have seen that the base case may be some value $n$ other than 0.
There are also instances in which more than one base case is required. A
simple example of this is provided by the following.

Example 9.11

Fact: For all $m \ge 2$ and for all $n \ge 1$, $f_{n+m-2} = f_n f_{m-1} + f_{n-1} f_{m-2}$.

Proof: We assume that $m \ge 2$ is fixed, and we prove the result by induction
on $n$.

Base Case ($n = 1$): $f_{1+m-2} = f_{m-1} = f_1 f_{m-1} + f_0 f_{m-2}$.

Base Case ($n = 2$): $f_{2+m-2} = f_m = f_{m-1} + f_{m-2} = f_2 f_{m-1} + f_1 f_{m-2}$.

Induction Step ($n > 2$):

$$\begin{aligned}
f_{n+m-2} &= f_{(n-1)+m-2} + f_{(n-2)+m-2} \\
  &= (f_{n-1} f_{m-1} + f_{n-2} f_{m-2}) + (f_{n-2} f_{m-1} + f_{n-3} f_{m-2}) && \text{(by inductive hypothesis, twice)} \\
  &= (f_{n-1} + f_{n-2}) f_{m-1} + (f_{n-2} + f_{n-3}) f_{m-2} \\
  &= f_n f_{m-1} + f_{n-1} f_{m-2}. \qquad \Box
\end{aligned}$$


The above proof required two base cases, as the inductive hypothesis is
invoked twice for the two values $n-1$ and $n-2$. If in the above proof we
only do the base case for $n = 1$, and in the induction step we try to cater
for all cases of $n > 1$ (in particular, $n = 2$), then the second invocation of
the inductive hypothesis would be invalid in the particular instance where
$n = 2$, since it would appeal to the case $n - 2 = 0$, which lies outside the
range $n \ge 1$ covered by the claim.


9.6 Fun with Fibonacci Numbers


In  this  section  we  explore  three  extended  induction  arguments  involving 
Fibonacci  numbers. 


9.6.1  A  Fibonacci  Number  Test 

Suppose  we  are  given  an  arbitrary  positive  integer  x  and  asked  whether  or 
not  it  is  a  Fibonacci  number.  For  example,  how  might  we  determine  whether 
or  not  the  number  517  is  a  Fibonacci  number?  The  only  apparent  way  is  to 
use  the  inductive  definition  to  compute  successive  Fibonacci  numbers  until 
we  reach  (or  -  more  likely  -  exceed)  517.  This  is,  however,  not  necessary; 
we  can  instead  use  the  following  simple  test: 

    A positive integer $x$ is a Fibonacci number if, and only if,
    $5x^2 \pm 4$ is a perfect square.

For example, $x = 3$ is a Fibonacci number, and $5 \cdot 3^2 + 4 = 49 = 7^2$; and $x = 5$
is a Fibonacci number, and $5 \cdot 5^2 - 4 = 121 = 11^2$. However, $x = 4$ is not a
Fibonacci number, and neither $5 \cdot 4^2 - 4 = 76$ nor $5 \cdot 4^2 + 4 = 84$ is a perfect
square.

For our less-modest example $x = 517$ above, a few calculator keystrokes
tell us that $5 \cdot 517^2 - 4 = 1336441$, and pressing the square root button gives
us 1156.0454, so $5x^2 - 4$ is clearly not a perfect square; and $5 \cdot 517^2 + 4 =
1336449$, and pressing the square root button gives us 1156.0489, so $5x^2 + 4$
is also not a perfect square. Therefore, this test tells us that $x = 517$ is not
a Fibonacci number. On the other hand, testing the value $x = 610$, a few
calculator keystrokes tell us that $5 \cdot 610^2 - 4 = 1860496$, and pressing the
square root button gives us 1364; in this case $5x^2 - 4$ is a perfect square,
meaning that the value $x = 610$ is a Fibonacci number (indeed $f_{15} = 610$).
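The calculator procedure just described can be sketched as a short program. To avoid the rounding issues of a floating-point square root, this sketch uses Python's exact integer square root `math.isqrt`; the function names `is_perfect_square` and `is_fibonacci` are our own choices.

```python
import math


def is_perfect_square(m: int) -> bool:
    """True if m is the square of some integer."""
    if m < 0:
        return False
    r = math.isqrt(m)          # exact integer square root, rounded down
    return r * r == m


def is_fibonacci(x: int) -> bool:
    """The test from the text: 5x^2 + 4 or 5x^2 - 4 is a perfect square."""
    if x <= 0:
        raise ValueError("the test is stated for positive integers only")
    return is_perfect_square(5 * x * x + 4) or is_perfect_square(5 * x * x - 4)


assert is_fibonacci(3) and is_fibonacci(5) and is_fibonacci(610)
assert not is_fibonacci(4) and not is_fibonacci(517)
```

Unlike the approach of generating successive Fibonacci numbers, this test needs only a couple of multiplications and two integer square roots, however large $x$ is.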

The  following  two  exercises  provide  the  basis  for  the  argument  that  this 
test  is  valid. 


Exercise 9.12 (Solution on page 451)

Show, by induction on $n$, that for all $n \ge 0$ the pair $(x, y) = (f_n, f_{n+1})$
satisfies the equation

$$y^2 - xy - x^2 = \pm 1.$$


Exercise 9.13 (Solution on page 451)


Show, by induction on $x+y$, that if the pair $(x, y)$ of positive integers satisfies
the equation

$$y^2 - xy - x^2 = \pm 1$$

then $(x, y) = (f_n, f_{n+1})$ for some $n > 0$. (Hint: For the induction step, show
that the “smaller” positive integer pair $(y-x, x)$ also provides a solution.)


Theorem 9.13 Fibonacci Test

A positive integer $x$ is a Fibonacci number if, and only if, $5x^2 \pm 4$ is a perfect
square.


Proof: We start by recalling the quadratic formula which states that
the quadratic equation

$$ay^2 + by + c = 0$$

is solved by the following values of $y$:

$$y = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

In particular, for a given positive value of $x$, the quadratic equation

$$y^2 - xy - x^2 = \pm 1$$

is solved by the following positive value of $y$:

$$y = \frac{x + \sqrt{x^2 + 4(x^2 \pm 1)}}{2} = \frac{x + \sqrt{5x^2 \pm 4}}{2}.$$

By Exercise 9.12, if $x = f_n$, then the value of $y$ given by this formula must
be $f_{n+1}$, from which we can deduce that $5x^2 \pm 4$ must be a perfect square.

Conversely, if $5x^2 \pm 4$ is a perfect square for some positive integer $x$, then
the value of $y$ given by this formula, like $x$, must be a positive integer, in
which case Exercise 9.13 tells us that $x$ (as well as $y$) must be a Fibonacci
number. □


9.6.2  A  Carrollean  Paradox 

The  following  result  is  known  as  Cassini’s  Identity. 

Exercise 9.14 (Solution on page 452)

Show, by induction on $n$, that $f_{n+1}^2 - f_n f_{n+2} = (-1)^n$ for all $n \ge 0$.


Cassini’s  Identity  forms  the  basis  of  a  famous  puzzle  devised  by  Lewis 
Carroll.  The  puzzle  is  described  in  the  following  exercise. 

Exercise 9.15 (Solution on page 452)

Take a square whose sides are 8 units long, cut it into four sections (two
triangles and two quadrilaterals), and rearrange these four sections into a
rectangle whose sides are 5 units and 13 units long as shown here:

[Figure: an 8×8 square cut into four sections, with edge lengths 3, 5, 5 and 8 marked, rearranged into a 5×13 rectangle.]

The area of the 8×8 square is 64 square units, but the area of the 5×13
rectangle is 65 square units! Where does the extra square unit come from?

This same phenomenon occurs with any square whose sides are of length
taken from the Fibonacci numbers. For example consider the following
13×13 square cut up and rearranged into an 8×21 rectangle:

[Figure: a 13×13 square cut into four sections, with edge lengths 13 and 8 marked, rearranged into an 8×21 rectangle.]

In this case, the area of the square is 169 square units, but the area of the
rectangle is 168 square units, so this time we lose one square unit. Where
did it go?
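The two area discrepancies are instances of Cassini's Identity, with the side of the square playing the role of $f_{n+1}$ and the sides of the rectangle those of $f_n$ and $f_{n+2}$. The sketch below checks the identity numerically and then the two dissections; it assumes the convention $f_0 = 0$, $f_1 = 1$, and the function name `fib` is our own. It does not explain where the unit of area goes, which is the point of the exercise.

```python
def fib(n: int) -> int:
    """The n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Cassini's Identity for an initial segment of the natural numbers.
for n in range(0, 100):
    assert fib(n + 1) ** 2 - fib(n) * fib(n + 2) == (-1) ** n

# The two dissections: square of side f(n+1), rectangle f(n) by f(n+2).
assert (fib(5), fib(6), fib(7)) == (5, 8, 13)
assert 8 * 8 - 5 * 13 == -1      # n = 5: the rectangle gains one unit
assert (fib(6), fib(7), fib(8)) == (8, 13, 21)
assert 13 * 13 - 8 * 21 == 1     # n = 6: the rectangle loses one unit
```

The sign alternates with $n$, which is why the first rearrangement appears to gain a square unit and the second appears to lose one.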


9.6.3  Fibonacci  Decompositions 

The Unique Prime Factorisation Theorem states that any positive integer
$n$ has a unique decomposition into the product of prime numbers. For
example, the number $n = 364$ decomposes uniquely into the product of primes
as follows:

$$364 = 2 \cdot 2 \cdot 7 \cdot 13.$$


The proof of this theorem is carried out by induction on $n$, and can be found
in any but the most basic algebra reference book. Here, we present a similar
decomposition result which we shall find useful later.


Fact: Every integer $N \ge 0$ can be expressed uniquely as

$$N = f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_n}$$

where $0 \ll k_1 \ll k_2 \ll k_3 \ll \cdots \ll k_n$. (Here, $i \ll j$ means that $i \le j-2$.)
For example, $100 = 3 + 8 + 89 = f_4 + f_6 + f_{11}$.

Proof: First we demonstrate, by induction on $n$, that for all $n \ge 1$,

$$f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_n} < f_{k_n+1}$$

whenever $0 \ll k_1 \ll k_2 \ll k_3 \ll \cdots \ll k_n$.

Base Case ($n = 1$): $f_{k_1} < f_{k_1+1}$ as $k_1 \ge 2$ since $0 \ll k_1$.

Induction Step ($n > 1$):

$$\begin{aligned}
f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_{n-1}} + f_{k_n}
  &< f_{k_{n-1}+1} + f_{k_n} && \text{(by inductive hypothesis)} \\
  &\le f_{k_n-1} + f_{k_n} && \text{(since } k_{n-1} \ll k_n \text{, so } k_{n-1}+1 \le k_n-1\text{)} \\
  &= f_{k_n+1} && \text{(by definition)}
\end{aligned}$$

Thus if $N = f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_n}$ where $0 \ll k_1 \ll k_2 \ll \cdots \ll k_n$
then we must have that $f_{k_n} \le N < f_{k_n+1}$.

The main result then follows by induction on $N \ge 0$.

Base Case ($N = 0$): Trivially $0 = f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_0}$ (the empty sum).

Induction Step ($N > 0$): Let $k$ be such that $f_k \le N < f_{k+1}$. Then
$(N - f_k) < f_{k+1} - f_k = f_{k-1}$.

If $N$ is to be represented as required, then by the above result, $f_k$ must
be one (indeed the largest) of the summands.

But then by the inductive hypothesis, $(N - f_k) \ge 0$ can be expressed
uniquely as

$$(N - f_k) = f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_n}$$

where $0 \ll k_1 \ll k_2 \ll k_3 \ll \cdots \ll k_n$.

Furthermore, since $f_{k_n} \le (N - f_k) < f_{k-1}$, we must have that $k_n < k-1$,
i.e. that $k_n \ll k$.

Taking $k_{n+1} = k$, we thus get that $N$ is expressed uniquely in the
required form as

$$N = f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_n} + f_{k_{n+1}}. \qquad \Box$$
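The induction step of the main result suggests a simple greedy procedure: repeatedly take the largest Fibonacci number not exceeding what remains. The following sketch implements that reading of the proof; the function names `fib` and `decompose` are our own, and the convention $f_0 = 0$, $f_1 = 1$ is assumed.

```python
def fib(n: int) -> int:
    """The n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def decompose(n: int) -> list[int]:
    """Return indices k_1 << k_2 << ... << k_m with n = f_{k_1} + ... + f_{k_m}."""
    if n < 0:
        raise ValueError("only non-negative integers are decomposed")
    fibs = [0, 1]                      # fibs[k] == fib(k)
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    indices = []
    k = len(fibs) - 1
    while n > 0:
        while fibs[k] > n:             # find the largest k with f_k <= n
            k -= 1
        indices.append(k)              # f_k must be the largest summand
        n -= fibs[k]
    return indices[::-1]               # report the indices in increasing order


assert decompose(100) == [4, 6, 11]    # 100 = f_4 + f_6 + f_11 = 3 + 8 + 89
```

The proof guarantees that the indices produced this way are at least 2 and differ pairwise by at least 2, since the remainder $N - f_k$ is always smaller than $f_{k-1}$.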
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We  give  here  a  few  examples  illustrating  common  mis-applications  and  mis¬ 
conceptions  of  induction. 


Example 9.16


Let $T : \mathbb{Z} \to \mathbb{Z}$ be the function which is defined by:

$$T(n) = \begin{cases} n + 6, & \text{if } n \le 0; \\ T(T(n-7)), & \text{otherwise.} \end{cases}$$

We can show that $T(n) = 6$ for all $n \ge 0$. To do this, it is tempting to use
induction on $n$ as follows.

Base Case ($n = 0$): $T(0) = 0 + 6 = 6$.

Induction Step ($n > 0$):

$$\begin{aligned}
T(n) &= T(T(n-7)) \\
  &= T(6) && \text{(by inductive hypothesis)} \\
  &= 6 && \text{(by inductive hypothesis)}
\end{aligned}$$

There are two errors in the above argument. First of all, if $n < 7$ then
$n - 7 < 0$ and the first inductive hypothesis cannot be applied. Secondly, the
claim that $T(6) = 6$ certainly doesn't follow from the inductive hypothesis
unless $n > 6$. These observations show that to make the induction work we
need to verify a range of base cases, namely, $T(n) = 6$ for $0 \le n \le 6$.
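The claim itself is easy to test by running the definition. The sketch below is a direct Python transcription of $T$; the ranges tested are arbitrary choices. It shows the values on the base-case range $0 \le n \le 6$ separately, since these are exactly the cases the flawed argument fails to cover.

```python
def T(n: int) -> int:
    """T(n) = n + 6 if n <= 0, and T(T(n - 7)) otherwise."""
    if n <= 0:
        return n + 6
    return T(T(n - 7))


# The range of base cases needed for a correct induction.
assert [T(n) for n in range(0, 7)] == [6, 6, 6, 6, 6, 6, 6]

# The claim on a longer initial segment.
assert all(T(n) == 6 for n in range(0, 200))
```

For $1 \le n \le 6$ the inner call $T(n-7)$ has a negative argument and returns $n-1$, so $T(n) = T(n-1)$; this is the observation a correct proof of the base cases can build on.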


Although the claim is true in the above example, the argument presented
demonstrates how easy it is to make illegitimate arguments to back up a
claim. On the other hand, the following example purports to demonstrate a
blatantly false claim through a seemingly innocuous induction argument.


Example 9.17 Sorites Paradox

Consider the following “proof” that sandpiles do not exist.

Claim: For each $n \ge 0$, $n$ grains of sand do not make a sandpile.

Proof: By induction on $n$.

Base Case ($n = 0$):

If there is no sand, then there can be no sandpile.

Induction Step ($n \ge 0$):

Suppose we have $(n+1)$ grains of sand which constitute a sandpile.
Clearly taking away a single grain of sand from a sandpile will still
leave us with a sandpile. However, we will only have $n$ grains of sand
left, which by induction does not constitute a sandpile. Hence our
$n+1$ grains of sand cannot constitute a sandpile. □


This is known as the sorites paradox or the heap paradox. The name
comes from the Greek word soros (σωρός) meaning “heap”. It relies on the
vagueness of words such as “heap” and “pile” and has many variations, each
of which is a precise and accurate application of valid logical principles
to arrive at a nonsensical conclusion.

• A man with only 1 hair is clearly bald.

• If a man with only 1 hair is bald,
then a man with only 2 hairs is bald.

• If a man with only 2 hairs is bald,
then a man with only 3 hairs is bald.

⋮

• If a man with only 9,999 hairs is bald,
then a man with only 10,000 hairs is bald.

Each of these observations is precise and valid, yet chaining them all together
allows us to conclude that a man with 10,000 hairs on his head is bald, a
wholly nonsensical claim.

In  reasoning  about  systems,  it  is  imperative  that  we  use  great  care  to 
employ  only  concepts  that  are  as  rigorously  defined  and  precise  as  the  logical 
means  we  use  to  analyse  them. 


The sorites paradox provides a playground for philosophers wanting to
debate the validity of inductive arguments, but relies heavily on ill-defined
terms removed from the rigour of mathematics. However, in the next
exercise we provide a subtle error hidden in an otherwise air-tight inductive
argument which leads to a clearly false conclusion. Can you uncover this
error?


Exercise 9.17 (Solution on page 452)

What is wrong with the following “proof” that all people are the same age?

We show, by induction on $n$, that for every collection $S$ of $n \ge 0$ people, all
people in $S$ are the same age.

Base Case ($n = 0$): Trivially the claim holds when $S$ consists of 0 people.

Base Case ($n = 1$): Trivially the claim holds when $S$ consists of 1 person.

Inductive Step ($n > 1$): Assuming that the claim holds for all collections
of size less than $n$, we show that it holds for any collection of size $n$.
Let $S$ be a collection of $n$ people. Let $S'$ and $S''$ be two overlapping
collections of people which together make up $S$: $S = S' \cup S''$. By the
inductive assumption, all people in $S'$ are the same age, and all people
in $S''$ are the same age. As $S'$ and $S''$ overlap, all people in $S$ must be
the same age.

9.8 Examples of Induction in Computer Science

The  following  example  is  typical  of  the  type  of  analysis  which  arises  in  the 
study  of  algorithms. 

Example 9.18

Consider the following recursive algorithm MinMax(A, p, q) for calculating
(x, y) where x and y are, respectively, the minimum and maximum values
appearing in the array A[1..n] between the indices p and q, inclusively
(the intention is to initially call the algorithm with MinMax(A, 1, n)).

MinMax(A, p, q)

1  if p = q then return (A[p], A[p])
2  else if p = q-1 then
3      if A[p] < A[q] then return (A[p], A[q])
4      else return (A[q], A[p])
5  else
6      (minL, maxL) := MinMax(A, p, p+1)
7      (minR, maxR) := MinMax(A, p+2, q)
8      return (min(minL, minR), max(maxL, maxR))

We are interested in calculating the number of comparisons which this
algorithm makes, as an indication of how long it takes to execute (a comparison
is made in line 3, and two are made in line 8 through the use of the functions
min and max). A simple analysis gives us that the number $T(n)$ of
comparisons made by a call to MinMax(A, p, q) with $n = q - p + 1$ is as follows:

1. if $n = 1$, that is, if $p = q$, then the algorithm terminates on line 1
without making any comparisons. Thus $T(1) = 0$.

2. if $n = 2$, that is, if $p = q-1$, then the algorithm terminates on line 3
or line 4 after making one comparison. Thus $T(2) = 1$.

3. if $n > 2$ then the algorithm makes

(a) $T(2)$ comparisons on line 6; followed by

(b) $T(n-2)$ comparisons on line 7; followed by

(c) 2 comparisons on line 8

before terminating. Thus $T(n) = T(2) + T(n-2) + 2$ for all $n > 2$.

The inductive definition of $T(n)$ is thus summarised as follows.


$T(1) = 0$

$T(2) = 1$

$T(n) = T(2) + T(n-2) + 2$ (for $n > 2$)




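Fact: $T(n) = \lceil 3n/2 \rceil - 2$ (where $\lceil x \rceil$ is $x$ rounded up to the nearest integer).

Proof: By induction on $n$.

Base Case ($n \le 2$): Clearly the result is true when $n = 1$ or $n = 2$.

Induction Step ($n > 2$): Suppose the result is true for all values $k < n$
for some $n > 2$. In particular,

$$T(n-2) = \left\lceil \frac{3(n-2)}{2} \right\rceil - 2 = \left\lceil \frac{3n}{2} \right\rceil - 5.$$

Thus

$$\begin{aligned}
T(n) &= T(2) + T(n-2) + 2 \\
  &= 1 + \left(\left\lceil \frac{3n}{2} \right\rceil - 5\right) + 2 && \text{(by inductive hypothesis)} \\
  &= \left\lceil \frac{3n}{2} \right\rceil - 2.
\end{aligned}$$

□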
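The comparison count can also be measured directly by instrumenting the algorithm. The sketch below is one possible Python rendering of MinMax which returns the number of comparisons alongside the result. It departs from the pseudocode in two ways that are our own choices: indices are 0-based, and each use of `min` or `max` is counted as one comparison, as in the analysis above. The name `min_max` and the test data are hypothetical.

```python
def min_max(a: list[int], p: int, q: int) -> tuple[int, int, int]:
    """Return (minimum, maximum, comparisons) for a[p..q] inclusive, 0-based."""
    if p == q:                                   # line 1: no comparison
        return a[p], a[p], 0
    if p == q - 1:                               # lines 3-4: one comparison
        if a[p] < a[q]:
            return a[p], a[q], 1
        return a[q], a[p], 1
    min_l, max_l, c_l = min_max(a, p, p + 1)     # line 6
    min_r, max_r, c_r = min_max(a, p + 2, q)     # line 7
    # line 8: min and max are counted as one comparison each
    return min(min_l, min_r), max(max_l, max_r), c_l + c_r + 2


for n in range(1, 150):
    data = [(7 * i) % 31 for i in range(n)]      # arbitrary sample values
    lo, hi, comparisons = min_max(data, 0, n - 1)
    assert (lo, hi) == (min(data), max(data))
    assert comparisons == (3 * n + 1) // 2 - 2   # ceil(3n/2) - 2 in integers
```

The count depends only on $n$ and not on the contents of the array, which is why a recurrence in $n$ alone suffices to describe it.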


The  next  two  examples  describe  the  technique  of  structural  induction, 
which  is  arguably  the  most  important  variant  of  induction  within  computing. 

Example 9.19

Let $A$ be an alphabet containing (at least) two distinct characters $a$ and $b$.

Fact: $aw \ne wb$ for all words $w \in A^*$.

Proof: By induction on length($w$).

Base Case (length($w$) = 0): In this case, we must have that $w = \varepsilon$, so
$aw = a \ne b = wb$.

Induction Step (length($w$) > 0): We consider two subcases, depending on
whether $w$ begins with the character $a$ or with some other character $c$.

$w = au$: Since length($u$) = length($w$) $- 1$ < length($w$), $au \ne ub$ by
the inductive hypothesis. Hence

$$aw = aau \ne aub = wb.$$

$w = cu$ (where $c \ne a$): $aw = acu \ne cub = wb$. □


The above is an example of a proof based on structural induction: the
inductive hypothesis assumes that the claim holds for all smaller structures
(in this case, for all shorter words), and uses this assumption to establish
that the claim holds for the structure in question. For this reason, such
a proof is typically referred to as a proof by induction on the structure of
words, and would more naturally be presented as follows.

Proof: By induction on the structure of words (that is, we prove the result
for a word $w$ under the inductive hypothesis that it is true for all smaller
words), arguing by cases on the structure of $w$ (that is, we consider in turn
three possible forms of $w$, namely $\varepsilon$, $au$ and $cu$ where $c \ne a$).

$w = \varepsilon$: $aw = a \ne b = wb$.

$w = au$: By induction (since $u$ is smaller than $w$), $au \ne ub$, so
$aw = aau \ne aub = wb$.

$w = cu$ (where $c \ne a$): $aw = acu \ne cub = wb$.
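For a finite alphabet, the Fact can be tested exhaustively on all short words. The sketch below does this for a three-letter alphabet; the alphabet `"abc"`, the length bound and the function name `holds_for_all_words` are assumptions chosen for illustration, and an exhaustive test up to a bound is of course weaker than the proof.

```python
from itertools import product


def holds_for_all_words(alphabet: str, max_length: int) -> bool:
    """Check that a + w differs from w + b for every word w up to max_length."""
    for length in range(0, max_length + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)       # length 0 gives the empty word
            if "a" + w == w + "b":
                return False
    return True


assert holds_for_all_words("abc", 7)
```

The enumeration visits words in order of increasing length, which is the same order in which the inductive hypothesis makes shorter words available to longer ones.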


We  give  one  further  example,  without  the  excessive  explanations. 

Example 9.20

Fact: Every binary tree $t$ has exactly one more leaf than internal node.

Proof: By induction on the structure of $t$, arguing by cases on the structure
of $t$.

$t = *$: The tree $*$ has 1 leaf and 0 internal nodes.

$t = N(t_1, t_2)$: By induction, $t_i$ (for $i = 1, 2$) must have $n_i$ internal nodes and $n_i + 1$
leaves, for some $n_1, n_2$. But then $N(t_1, t_2)$ must have $n_1 + n_2 + 1$ internal nodes
and $(n_1 + 1) + (n_2 + 1) = (n_1 + n_2 + 1) + 1$ leaves. □
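Structural induction mirrors structural recursion in programs. In the sketch below, a binary tree is either a leaf or a node with two subtrees, and the two counting functions have one clause per form of tree, just as the proof has one case per form. The class names `Leaf` and `Node` and the helper `all_trees` are our own; they are not taken from the text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """The tree consisting of a single leaf."""


@dataclass(frozen=True)
class Node:
    """An internal node N(left, right) with two subtrees."""
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def leaves(t: Tree) -> int:
    if isinstance(t, Leaf):
        return 1
    return leaves(t.left) + leaves(t.right)


def internal_nodes(t: Tree) -> int:
    if isinstance(t, Leaf):
        return 0
    return internal_nodes(t.left) + internal_nodes(t.right) + 1


def all_trees(n: int) -> list[Tree]:
    """Every binary tree with exactly n internal nodes."""
    if n == 0:
        return [Leaf()]
    return [Node(l, r)
            for k in range(n)
            for l in all_trees(k)
            for r in all_trees(n - 1 - k)]


for n in range(0, 7):
    for t in all_trees(n):
        assert leaves(t) == internal_nodes(t) + 1
```

The check covers every tree shape with up to six internal nodes; the proof covers all of them at once.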


Exercise 9.20 (Solution on page 453)

Prove by induction that length($L_1$ ++ $L_2$) = length($L_1$) + length($L_2$) for all
lists $L_1$ and $L_2$, using the inductive definition of the length of a list from
Example 8.12, and your inductive definition of the append function from
Exercise 8.13.


Additional  Exercises 


1. Prove the following hold for all $n \ge 0$, by induction on $n$.

(a) $1^3 + 2^3 + 3^3 + \cdots + n^3 = (1 + 2 + 3 + \cdots + n)^2$.

(This is known as Nicomachus's Theorem.)

(b) $1^2 + 3^2 + 5^2 + \cdots + (2n-1)^2 = \dfrac{n(2n-1)(2n+1)}{3}$.

(c) $1^3 + 3^3 + 5^3 + \cdots + (2n-1)^3 = n^2(2n^2 - 1)$.

(d) $1 \cdot 2 \cdot 3 + 2 \cdot 3 \cdot 4 + 3 \cdot 4 \cdot 5 + \cdots + n(n+1)(n+2) = \dfrac{n(n+1)(n+2)(n+3)}{4}$.

(e) $1(1!) + 2(2!) + 3(3!) + \cdots + n(n!) = (n+1)! - 1$.

(f) $\dfrac{1}{4(1^2) - 1} + \dfrac{1}{4(2^2) - 1} + \dfrac{1}{4(3^2) - 1} + \cdots + \dfrac{1}{4(n^2) - 1} = \dfrac{n}{2n+1}$.


2. Show that every $n \ge 0$ can be expressed uniquely as

$$c_1(1!) + c_2(2!) + c_3(3!) + \cdots + c_n(n!)$$

where $0 \le c_j \le j$.

3. Prove, by induction on $(m+n)$, that for all $m, n \ge 0$:

$$1 \cdot 2 \cdot 3 \cdots m + 2 \cdot 3 \cdot 4 \cdots (m+1) + 3 \cdot 4 \cdot 5 \cdots (m+2) + \cdots + n(n+1)(n+2) \cdots (n+m-1) = \frac{n(n+1)(n+2)(n+3) \cdots (n+m)}{m+1}.$$

4. Prove, by induction on $n$, that a finite set with $n$ elements has $2^n$
subsets.

5. Define the sequence $(g_0, g_1, g_2, \ldots)$ as follows: $g_0 = 0$; $g_1 = 1$; $g_2 = 1$;
and for all $n \ge 2$,

$$g_{2n-1} = g_{n-1}^2 + g_n^2 \quad \text{and} \quad g_{2n} = g_{n+1}^2 - g_{n-1}^2.$$

Thus for example:

$n = 2$: $g_3 = g_1^2 + g_2^2$ and $g_4 = g_3^2 - g_1^2$

$n = 3$: $g_5 = g_2^2 + g_3^2$ and $g_6 = g_4^2 - g_2^2$

$n = 4$: $g_7 = g_3^2 + g_4^2$ and $g_8 = g_5^2 - g_3^2$

Show, by induction on $n$, that $g_n = f_n$ for all $n \ge 0$.

6. Define the sequence $(x_0, x_1, x_2, \ldots)$ as follows:

$x_0 = 0$; $x_{n+1} =$ [formula missing] $(n \ge 0)$.

Show, by induction on $n$, that $x_n =$ [formula missing] for all $n \ge 0$.

7. Provide a correct proof for the claim made in Example 9.16.


8. Suppose that in a particular country, every road is one-way, and every
pair of cities is connected by exactly one direct road. Show, by induction
on the number $n$ of cities, that there exists a city which can be
reached from every other city either directly or via only one other city.

9. Imagine drawing $n$ straight lines in the plane (extending to infinity in
both directions). The resulting configuration is to be coloured like a
map, with no two bordering “countries” having the same colour (but
two countries which meet at a single point may have the same colour).

Show, by induction on $n$, that only two colours are needed.

(Hint: Suppose you have such a coloured plane with $n$ lines, and you
draw a new line; clearly the colouring condition fails nowhere except
across this new ‘border’. How can you restore the colouring condition
without altering the colours on one side of this border?)

10.  A  collection  of  n  circles  drawn  in  the  plane  divide  the  plane  into  parts. 
Show  that  you  can  colour  the  parts  with  two  colours  so  that  no  two 
parts  with  a  common  boundary  line  are  coloured  the  same  way. 

(Hint:  Similar  to  the  previous  exercise.) 

11. You are given a $2^n \times 2^n$ checkerboard with one black square arbitrarily
placed on the board and the remaining $4^n - 1$ squares white. You are
also given a supply of tiles which look like 2×2 checkerboards with one
corner square removed. You want to tile the checkerboard so that each
white square is covered exactly once, while the black square remains
uncovered.

Show, by induction on $n$, that the $2^n \times 2^n$ checkerboard can be so tiled,
for all $n \ge 0$.

(Hint: For the inductive step, place the first tile in
the centre of the board with the gap in the quadrant
containing the black square, and look at the four
$2^{n-1} \times 2^{n-1}$ quadrants.)


12. You are given a checkerboard in the shape of an equilateral triangle
with sides of length $2^n$ made up of smaller equilateral triangles with
sides of unit length. The topmost equilateral triangle is black but all
others are white. You are also given a supply of tiles in the form of
bucket-shaped trapeziums made from three small equilateral triangles.

[Figure: a bucket-shaped trapezium tile made from three small equilateral triangles.]

You want to tile the large triangular-shaped checkerboard so that each
white triangle is covered exactly once, while the black triangle remains
uncovered.

Show, by induction on $n$, that the whole checkerboard can be so tiled,
for all $n \ge 0$.

13.  There  are  n  identical  cars  on  a  circular  track.  Among  all  of  them,  they 
have  just  enough  petrol  for  one  car  to  complete  a  lap.  Show  that  there 
is  a  car  which  can  complete  a  lap  by  collecting  petrol  from  the  other 
cars  on  its  way  around. 

(Hint:  For  the  induction  step,  first  argue  that  there  is  a  car  A  which 
can  reach  the  next  car  B.  Then  consider  removing  B  from  the  track, 
emptying  its  petrol  into  A.) 

14.  I  put  two  cards  on  a  table  and  tell  two  people  that  the  cards  have 
different  positive  integers  written  on  their  undersides.  I  tell  them  to 
take  one  card  each  at  random  and  secretly  look  at  the  number  written 
on  their  card.  They  are  then  put  in  a  room  with  a  clock  which  rings  a 
bell  every  minute.  They  are  not  allowed  to  communicate  in  any  way, 
but  are  instructed  to  wait  in  the  room  until  one  of  them  knows  which 
card  has  the  lower  number  and  which  has  the  higher  number,  and  then 
to  announce  this  fact  the  next  time  the  clock  rings. 

There  seems  to  be  no  escape  for  these  two  people,  as  there  seems  to 
be  no  way  for  either  of  them  to  discover  who  has  the  larger  number. 
Imagine  being  one  of  the  two,  sitting  with  a  card  with  the  number  26 
written  on  it;  how  could  you  possibly  determine  whether  the  card  held 
by  the  other  person  has  a  number  which  is  smaller  than  this  or  greater 
than  this?  Paradoxically,  it  is  doable. 

Prove, by induction on $n \geq 1$, that if n is the lower of the two numbers
written on the two cards, then the person who has this card will announce
that he has the card with the lower number after the bell rings
n times.

15.  What is wrong with the following “proof” that

$$1 + 2 + 3 + \cdots + n = \frac{(n-1)(n+2)}{2}.$$

Proof: By induction on n.

$$\begin{aligned}
1 + 2 + 3 + \cdots + n &= \bigl(1 + 2 + 3 + \cdots + (n-1)\bigr) + n \\
&= \frac{(n-2)(n+1)}{2} + n \quad \text{(by the inductive hypothesis)} \\
&= \frac{(n-1)(n+2)}{2}. \qquad \Box
\end{aligned}$$

16.  What is wrong with the following “proof” that every natural number
is interesting.¹

Proof: By induction on n.

Base Case (n = 0): 0 is interesting as it is the smallest natural number.

Induction  Step:  (n  >  0): 

Suppose every number less than n is interesting. If n itself is
interesting for some reason, then we are done. On the other hand,
if there is nothing interesting about n, then it is in fact the first
natural number which is not interesting, which makes it an
interesting number indeed!  □


17.  What is wrong with the following “proof” that $x = 2x$ for all real
numbers $x \geq 0$?

Proof: By induction on x.

Base Case ($x = 0$): $x = 0 = 2 \cdot 0 = 2x$.

Induction Step: ($x > 0$):

Suppose $y = 2y$ for every positive real number y less than x.

In particular, since $\frac{x}{2} < x$, $\frac{x}{2} = 2\left(\frac{x}{2}\right) = x$.

But then $x = 2\left(\frac{x}{2}\right) = 2x$ (by induction).  □


18.  Despite seeming more powerful, the principle of strong induction
follows from ordinary induction, and hence provides added convenience
but not added power. This can be demonstrated as follows.

Suppose, for a property P(n) of natural numbers, the premise of strong
induction holds:

$$\forall n\,\bigl((\forall k<n\; P(k)) \Rightarrow P(n)\bigr)$$

That is, P(n) holds of a particular value n whenever it holds for all
smaller values. We will show, by ordinary induction, that $\forall n\, P(n)$ is
true. Let Q(n) be the property $\forall k<n\; P(k)$.

(a)  Show that $\forall n\, P(n) \Leftrightarrow \forall n\, Q(n)$ without using induction.

(b)  Show that $\forall n\, Q(n)$ by ordinary induction. Thus, by part (a),
$\forall n\, P(n)$.


¹This proof is clearly wrong, as no number is interesting. Proof: Suppose some numbers
are interesting; then there must be a smallest interesting number n. So what, who cares?


Chapter  10 

Games  and  Strategies 


You  have  to  learn  the  rules  of  the  game.  And  then  you  have  to  play 
better  than  anyone  else. 

-  Albert  Einstein. 

Games-of-chance derive this title from the fact that luck plays a part in
deciding the winner of a play of the game. Sometimes the game consists
solely of luck, as with Coin-Flipping (“heads wins”) or Card-Cutting
(“highest card wins”). Typically, though, this isn’t the case, and a sensible
strategy is needed to beat a good player who isn’t burdened by a string of
extraordinarily bad deals of the cards (in the case of, e.g., Poker or Bridge),
or throws of the dice (in the case of, e.g., Backgammon or Monopoly).
However, casinos operate (very successfully!) on the premise that (most of)
their clientele do not play with luck on their side.

What we might call games-of-no-chance are those games for which
the  winner  is  decided  based  solely  on  ability.  Examples  of  such  games  are 
Chess  and  Go.  They  involve  no  decisions  taken  on  the  results  of  random 
events  such  as  the  deal  of  cards  or  the  throw  of  dice,  and  no  information  is 
hidden  from  the  players  (apart  from  what  moves  the  other  player  will  choose 
to  make  during  the  play  of  the  game). 

For  example,  in  the  children’s  paper-and-pencil  game  Noughts  and 
Crosses  (also  known  as  Tic-Tac-Toe),  two  players  alternately  place  crosses 
(x)  and  noughts  (o)  in  nine  square  spaces  arranged  in  a  3  x  3  grid.  The 
goal  of  the  first  player  is  to  align  three  crosses  in  a  line  (row,  column  or 
diagonal),  and  the  goal  of  the  second  player  is  to  align  three  noughts  in  a 
line  (row,  column  or  diagonal).  A  player  wins  the  game  if  they  achieve  their 
goal  before  the  other  player  does  so.  A  game  that  ends  with  a  full  grid 
without  a  line  of  crosses  or  noughts  is  a  draw. 

When children first learn to play this game, the outcomes will be
variable; sometimes the first player wins, sometimes the second player wins,
and sometimes the game ends in a draw. However, every child eventually
becomes bored with this game, as they discover that they can only win if
their opponent makes a silly error. This is regardless of whether they are
playing as first player or second player, though it seems that the first player
should have a distinct advantage.


10.1  Strategies for Games-of-No-Chance

In  this  chapter  we  shall  be  interested  in  such  two  player  games-of-no-chance. 
We  shall  typically  refer  to  the  first  player  as  A  (for  Alice)  for  whom  we  shall 
use  female  pronouns  (she,  her),  and  the  second  player  as  B  (for  Bob)  for 
whom  we  shall  use  male  pronouns  (he,  his).  Furthermore,  these  will  be 
games  of  perfect  information,  meaning  that  both  players  will  be  aware 
of  all  aspects  of  the  game:  at  every  point  in  the  game,  both  players  know 
what  moves  have  been  made  up  to  that  point  in  time,  as  well  as  what  moves 
their  opponent  can  make  in  response  to  any  move  that  they  themselves 
make.  The  game  of  Paper-Scissors-Rock,  for  example,  is  not  a  game  of 
perfect  information,  as  neither  player  has  information  regarding  the  move 
being  made  by  the  other  player.  While  there  is  no  element  of  chance  in 
the  players’  decision  making,  as  each  player  is  free  to  choose  whatever  move 
they  wish,  the  lack  of  information  about  the  opponent’s  move  makes  luck  a 
factor  in  this  game. 

Another  typical  feature  of  the  games  that  we  shall  consider  is  finiteness. 
A  finite  game  is  one  that  is  guaranteed  to  terminate  within  a  finite  number 
of steps. This isn’t true of many games, for example Chess (unless some rule
is introduced which declares a game to be a draw if it continues indefinitely,
the standard rule being that a draw is declared if 50 consecutive moves have
been made without a piece being captured or a pawn being moved). If
a  play  of  a  particular  game  may  continue  indefinitely,  we  will  rule  infinite 
plays  to  be  predetermined  in  some  way;  that  is,  either  the  first  player  wins 
every  infinite  play,  or  the  second  player  wins  every  infinite  play,  or  the  game 
is  declared  to  be  a  draw.  For  example,  we  may  declare  that  every  infinite 
play  of  the  game  of  Chess  is  ruled  to  be  a  draw. 

We  shall  at  times  consider  games  in  which  the  first  player  is  in  the  role 
of  an  attacker;  she  makes  attacking  moves  which  the  second  player,  in  the 
role  of  a  defender,  must  defend  against  with  his  responses.  We  may  refer 
to  such  games  as  attacker- defender  games.  The  first  player’s  aim  is  to 
achieve  some  goal  (which  will  end  the  game),  while  the  second  player’s  aim 
is  to  prevent  her  from  doing  this.  The  important  aspect  of  these  games  is 
that  a  play  which  continues  forever  is  a  positive  result  for  the  second  player; 
that  is,  every  infinite  play  of  an  attacker-defender  game  is  ruled  to  be  a  win 
for  the  second  player. 

A strategy for a player in a game is a rule which tells that player what
move to make each time it is their turn to move. A strategy which
guarantees that you will win the game regardless of what moves your opponent
makes is referred to as a winning strategy. If a game may end in a draw,
then a strategy which guarantees that your opponent will not win (without
guaranteeing that you will win) is referred to as a drawing strategy.

A  position  in  a  game  is  a  winning  position  if  the  player  whose  turn  it 
is  has  a  winning  strategy  from  this  position;  it  is  a  losing  position  if  the 
other  player  (whose  turn  it  is  not)  has  a  winning  strategy  from  this  position; 
and  finally,  it  is  a  drawing  position  if  neither  player  has  a  winning  strategy 
from  this  position.  Clearly,  from  a  winning  position  there  must  be  a  move 
to  a  losing  position,  while  every  move  from  a  losing  position  must  lead  to  a 
winning position. From a drawing position there must be some move to a
drawing  position,  perhaps  some  moves  to  winning  positions,  but  no  moves 
to  losing  positions. 

For  a  given  game  it  is  not  possible  for  both  players  to  have  a  winning 
strategy,  though  it  is  possible  that  neither  player  has  one.  For  example,  we 
noted  above  that  neither  player  has  a  winning  strategy  in  Noughts  and 
Crosses;  the  game  can  be  played  out  through  the  maximum  nine  moves 
filling  in  all  nine  squares  in  the  grid  without  either  player  winning,  regardless 
of  how  cleverly  they  play.  The  first  player  does  not  have  a  winning  strategy 
because: 

1.  no  matter  what  the  first  player  does 

2.  there  is  something  that  the  second  player  can  do  such  that 

3.  no  matter  what  the  first  player  does 

4.  there  is  something  that  the  second  player  can  do  such  that 

5.  no  matter  what  the  first  player  does 

6.  there  is  something  that  the  second  player  can  do  such  that 

7.  no  matter  what  the  first  player  does 

8.  there  is  something  that  the  second  player  can  do  such  that 

9.  no  matter  what  the  first  player  does 

she  will  not  have  formed  a  line  of  crosses. 

Similarly,  the  second  player  does  not  have  a  winning  strategy  because: 

1.  there  is  something  that  the  first  player  can  do  such  that 

2.  no  matter  what  the  second  player  does 

3.  there  is  something  that  the  first  player  can  do  such  that 

4.  no  matter  what  the  second  player  does 

5.  there  is  something  that  the  first  player  can  do  such  that 

6.  no matter what the second player does

7.  there is something that the first player can do such that

8.  no matter what the second player does

he  will  not  have  formed  a  line  of  noughts. 

The  simplicity  of  this  game  makes  it  easy  to  analyse;  a  play  consists  of 
(at  most)  nine  moves,  and  the  game  is  very  symmetric.  Thus  a  drawing 
strategy  for  both  players  is  easy  to  discover,  which  ultimately  renders  the 
game  uninteresting  to  play. 
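
The alternating "no matter what ... there is something ..." pattern above can be checked mechanically by searching every play of the game. The following is a minimal sketch rather than anything from the text: the board encoding (a 9-character string, squares numbered row by row) and the names `winner` and `value` are assumptions made for illustration.

```python
from functools import lru_cache

# The eight lines of the 3 x 3 grid, with squares numbered 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'x' or 'o' if that symbol has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Outcome under best play with `player` to move:
    +1 if crosses win, -1 if noughts win, 0 for a draw."""
    w = winner(board)
    if w is not None:
        return 1 if w == 'x' else -1
    if ' ' not in board:
        return 0  # full grid without a line: a draw
    nxt = 'o' if player == 'x' else 'x'
    outcomes = [value(board[:i] + player + board[i + 1:], nxt)
                for i, sq in enumerate(board) if sq == ' ']
    # Crosses pick the best outcome for themselves, noughts the worst for crosses.
    return max(outcomes) if player == 'x' else min(outcomes)


print(value(' ' * 9, 'x'))  # 0: neither player can force a win
```

The `max` corresponds to "there is something the player can do" and the `min` to "no matter what the other player does"; a result of 0 from the empty grid says that both players have drawing strategies.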

All  such  games  are  boring  in  this  sense.  At  most  one  of  the  two  players 
has  a  winning  strategy;  and  by  following  this  strategy,  they  ensure  that  the 
other  player  cannot  do  anything  to  avoid  losing.  If  draws  are  possible,  then 
both  players  may  have  strategies  which  prevent  the  other  from  winning. 
This  important  fact  is  recorded  by  the  following  theorem. 

Theorem 10.1

In  any  two-player  game-of-no-chance  of  perfect  information,  either  one  of  the 
two  players  has  a  winning  strategy,  or  they  both  have  drawing  strategies. 


Proof:  Clearly,  if  one  of  the  two  players  has  a  winning  strategy,  then  the 
other  player  cannot  have  a  winning  strategy:  Fixing  the  strategies  of  the 
two  players,  only  one  of  the  two  players  can  win  the  game,  so  only  one  of 
these  two  strategies  can  be  a  winning  strategy. 

Assume,  then,  that  neither  player  has  a  winning  strategy.  That  the  first 
player  does  not  have  a  winning  strategy  means  that  the  second  player  may 
respond  to  each  move  made  by  the  first  player  in  such  a  way  that  either: 

•  the  game  ends  in  a  draw  or  as  a  win  for  the  second  player;  or 

•  the game continues forever, and infinite games are either ruled to be
draws  or  ruled  to  be  wins  for  the  second  player. 

That  is  to  say,  the  second  player  has  a  strategy  for  ensuring  that  the  first 
player  does  not  win.  Equally,  that  the  second  player  does  not  have  a  win¬ 
ning  strategy  means  that  the  first  player  has  a  strategy  for  ensuring  that 
the  second  player  does  not  win.  Each  of  these  strategies,  therefore,  must  be 
a  drawing  strategy  for  its  associated  player.  □ 


Corollary  10.1 


If  a  game  cannot  end  in  a  draw,  then  one  of  the  two  players  has  a  winning 
strategy. 


Example


Consider  the  following  game:  starting  with  a  pile  of  10  coins,  two  players 
take  turns  removing  either  2  coins  or  3  coins  from  the  pile.  The  player  who 
takes the last coin wins; if one coin remains, then the game is a draw.

We can systematically analyse this game from the end backwards as
follows:

(1)  If  there  is  1  coin  left,  then  the  game  is  a  draw.  This  is  thus  a  drawing 
position. 

(2)  If  there  are  2  coins  left,  then  you  can  win  the  game  by  taking  both 
coins.  This  is  thus  a  winning  position. 

(3)  If  there  are  3  coins  left,  then  you  can  win  the  game  by  taking  all  three 
coins.  This  is  thus  a  winning  position. 

(4)  If  there  are  4  coins  left,  then  you  can  either: 

-  take  2  coins  and  leave  2,  thus  leaving  the  other  player  in  what  we 
know  from  (2)  above  is  a  winning  position;  or 

-  take 3 coins and leave 1, thus leaving the other player in what we
know from (1) above is a drawing position.

Clearly  the  latter  option  is  the  correct  one  to  make,  and  this  is  thus  a 
drawing  position. 

(5)  If  there  are  5  coins  left,  then  you  can  either: 

-  take  2  coins  and  leave  3,  thus  leaving  the  other  player  in  what  we 
know  from  (3)  above  is  a  winning  position;  or 

-  take 3 coins and leave 2, again leaving the other player in what
we know from (2) above is a winning position.

Whatever  you  do  will  leave  the  other  player  in  a  winning  position. 
This  is  thus  a  losing  position. 

(6)  If  there  are  6  coins  left,  then  you  can  either: 

-  take  2  coins  and  leave  4,  thus  leaving  the  other  player  in  what  we 
know  from  (4)  above  is  a  drawing  position;  or 

-  take 3 coins and leave 3, thus leaving the other player in what we
know from (3) above is a winning position.

Clearly  the  first  option  is  the  correct  one  to  make,  and  this  is  thus  a 
drawing  position. 

(7)  If  there  are  7  coins  left,  then  you  can  take  2  coins  and  leave  5,  which 
we  know  from  (5)  above  is  a  losing  position.  This  is  thus  a  winning 
position. 

(8)  If  there  are  8  coins  left,  then  you  can  take  3  coins  and  leave  5,  which 
we  know  from  (5)  above  is  a  losing  position.  This  is  thus  a  winning 
position. 

(9)  If there are 9 coins left, then you can either:

-  take 2 coins and leave 7, thus leaving the other player in what we
know from (7) above is a winning position; or

-  take 3 coins and leave 6, thus leaving the other player in what we
know from (6) above is a drawing position.

Clearly  the  latter  option  is  the  correct  one  to  make,  and  this  is  thus  a 
drawing  position. 

(10)  If  there  are  10  coins  left  (that  is,  if  you  are  at  the  start  of  the  game), 
then  you  can  either: 

-  take  2  coins  and  leave  8,  thus  leaving  the  other  player  in  what  we 
know  from  (8)  above  is  a  winning  position;  or 

-  take 3 coins and leave 7, again leaving the other player in what
we know from (7) above is a winning position.

Whatever  you  do  will  leave  the  other  player  in  a  winning  position. 
This  is  thus  a  losing  position. 

We can summarise this analysis concisely in the following table:

Coins left:      0   1   2   3   4   5   6   7   8   9  10
Position:        L   D   W   W   D   L   D   W   W   D   L
Coins to take:   -   -   2   3   3   -   2   2   3   3   -

The top row indicates the number of coins remaining in the pile; the middle
row indicates whether the current player is in a winning position (W), a
losing position (L), or a drawing position (D); and the bottom row indicates
how many coins (2 or 3) the player should remove from the pile in that turn
(there being no entry where the player is in a losing position, or where no
move is possible).
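
The backward analysis of this game is mechanical enough to automate. The sketch below is an illustration under assumptions, not part of the original analysis: the function name `classify` and its parameters are invented here, and it generalises the rule "one coin remaining is a draw" to "being unable to move with coins left is a draw".

```python
def classify(n_max=10, moves=(2, 3)):
    """Label each pile size 0..n_max as 'W', 'L' or 'D' for the player to move,
    together with a recommended move (None when there is nothing to recommend)."""
    # With 0 coins left, the opponent has just taken the last coin and won.
    label, best = {0: 'L'}, {0: None}
    for n in range(1, n_max + 1):
        options = [m for m in moves if m <= n]
        to_losing = [m for m in options if label[n - m] == 'L']
        to_drawing = [m for m in options if label[n - m] == 'D']
        if to_losing:                    # some move leaves the opponent losing
            label[n], best[n] = 'W', to_losing[0]
        elif not options or to_drawing:  # stuck with coins left, or can hold a draw
            label[n], best[n] = 'D', (to_drawing[0] if to_drawing else None)
        else:                            # every move leaves the opponent winning
            label[n], best[n] = 'L', None
    return label, best


label, best = classify()
print(' '.join(label[n] for n in range(11)))             # L D W W D L D W W D L
print(' '.join(str(best[n] or '-') for n in range(11)))  # - - 2 3 3 - 2 2 3 3 -
```

Each pile size is decided only from the labels of smaller piles, which is exactly the "from the end backwards" order used in steps (1) to (10).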

Figure 10.1 depicts the so-called game tree for this game. The
nodes  of  this  tree  represent  positions  in  the  game  (labelled  by  the  number 
of  coins  remaining  in  the  pile),  and  the  arrows  represent  the  possible  moves 
which  a  player  can  make  in  the  given  position  (labelled  by  the  number 
of  coins  removed  by  that  move).  The  winning  positions  are  depicted  by 
circled  nodes,  while  the  losing  positions  are  depicted  by  boxed  nodes;  the 
nodes  which  are  neither  circled  nor  boxed  depict  drawing  positions.  The 
important  observations  to  make  are: 

1.  every winning position has at least one move leading to a losing
position (that is, every circled node has an arrow leading to a boxed node,
emphasised in the figure by a double arrow);

2.  every move from a losing position leads to a winning position (that is,
every arrow from a boxed node leads to a circled node); and

3.  every drawing position has a move to another drawing position;
possibly a move to a winning position; but no move to a losing position
(that is, every undecorated node has an arrow to another undecorated
node, emphasised in the figure by a double arrow; possibly an arrow
to a circled node; but no arrow to a boxed node).


These  three  observations  respectively  define  what  it  means  for  a  position  to 
be  a  winning  position,  a  losing  position,  or  a  drawing  position. 
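
These three observations also suggest a general labelling procedure for any finite game graph, including graphs with cycles. The following is a hedged sketch: the representation (a dictionary from each position to its successor positions), the name `label_positions`, and the convention that every infinite play counts as a draw are all assumptions made for this illustration. An attacker-defender game would need a different convention for infinite plays.

```python
def label_positions(moves, terminal):
    """moves: dict mapping each position to the list of positions reachable in one move.
    terminal: dict giving the label ('W', 'L' or 'D') of positions with no moves.
    Returns a dict labelling every position for the player whose turn it is."""
    label = dict(terminal)
    changed = True
    while changed:
        changed = False
        for pos, succ in moves.items():
            if pos in label or not succ:
                continue
            if any(label.get(s) == 'L' for s in succ):
                label[pos] = 'W'   # observation 1: some move reaches a losing position
                changed = True
            elif all(label.get(s) == 'W' for s in succ):
                label[pos] = 'L'   # observation 2: every move reaches a winning position
                changed = True
    # Observation 3: whatever was never forced to W or L is a drawing position.
    return {pos: label.get(pos, 'D') for pos in moves}


# The 10-coin game from the example, written as a graph.
coins = {n: [n - m for m in (2, 3) if m <= n] for n in range(11)}
labels = label_positions(coins, {0: 'L', 1: 'D'})
print(''.join(labels[n] for n in range(11)))  # LDWWDLDWWDL
```

The loop repeats until no label changes, so positions on a cycle that can never be forced either way are left unlabelled and end up as draws.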


Exercise 10.1  (Solution on page 453)

In  the  game  of  Take-3,  there  is  a  single  pile  of  coins,  and  two  players 
alternately  remove  either  1,  2,  or  3  coins  from  the  pile.  The  player  who 
takes  the  last  coin  wins. 

1.  For  each  number  n  from  1  to  10,  explain  who  has  the  winning  strategy 
in  Take-3  starting  from  a  pile  of  n  coins.  In  the  cases  in  which  the 
first  player  has  the  winning  strategy,  state  how  many  coins  (1,  2  or  3) 
the  first  player  should  take. 

2.  Generalise  the  above  by  explaining  who  has  the  winning  strategy  in 
Take-3  starting  from  a  pile  of  n  coins  for  an  arbitrary  n. 

3.  Generalise the above further by explaining who has the winning
strategy in Take-k starting from a pile of n coins for an arbitrary n, but
where players may alternately remove between 1 and k coins (above,
we had k = 3).

4.  Misere Take-3 is played exactly like Take-3, but the object of the
game is to not take the last coin; that is, you wish to force your
opponent to take the last coin. How does this change the above analysis?


Exercise 10.2  (Solution on page 454)

In  the  game  of  Misere  Noughts  and  Crosses,  the  aim  is  to  avoid  placing 
three  of  your  symbols  in  a  row,  but  rather  to  force  your  opponent  to  place 
three  of  their  symbols  in  a  row. 

1.  The  first  player  does  not  have  a  winning  strategy  in  this  game.  Explain 
how  the  second  player  can  play  to  avoid  losing. 

(Hint:  It  is  a  good  idea  to  occupy  two  adjacent  side  squares  first,  and 
then  a  square  which  is  aligned  with  only  one  of  these  two  side  squares. 
Why  is  this  possible,  and  why  does  it  work?) 

2.  The  second  player  also  does  not  have  a  winning  strategy  in  this  game. 
Explain  how  the  first  player  can  play  to  avoid  losing. 

(Hint: Start by placing the first cross in the middle, and then
“mirror” every move of the second player by placing each subsequent
cross directly opposite to where the second player places his noughts.
Why is this a good idea?)


Exercise 10.3  (Solution on page 454)


The game of Clock-2-3 is played on a board which looks like the face of a
12-hour clock, as depicted below:


A token is placed on one of the hours (1 through 12) and the players take
turns moving the token either 2 or 3 hours forward (i.e., in a clockwise
fashion). The player who moves the token onto the 12 o’clock slot wins the
game.

Explain who has the winning strategy in Clock-2-3 starting from each
of the 12 hours. In the cases in which the first player has the winning
strategy, state how many hours forward (2 or 3) the first player should move
the token. (As a start, the first player clearly has a winning strategy starting
from either 9 o’clock or 10 o’clock, by moving 3 and 2 hours, respectively, to
land on 12 o’clock. Thus the second player has the winning strategy starting
from 7 o’clock, as the first player will be forced to move the token to either
9 o’clock or 10 o’clock.)


Exercise 10.4  (Solution on page 456)


The  following  depicts  a  simple  variant  of  the  children’s  board  game  Snakes 
and  Ladders. 


In  this  game,  a  single  shared  counter  is  started  on  square  1,  and  two  players 
take  turns  moving  the  counter  either  one  or  two  spaces  forward  (with  the 
player  moving  deciding  whether  to  move  one  or  two  spaces).  If  the  counter 
lands  at  the  foot  of  a  ladder,  it  climbs  to  the  top  of  the  ladder;  and  if  the 
counter  lands  on  the  head  of  a  snake,  it  slides  down  to  the  tail  of  the  snake. 
The  object  of  this  game  is  to  be  the  one  to  move  the  counter  to  the  final 
square  number  9. 

Identify  which  of  the  positions  are  winning  positions;  which  are  losing 
positions;  and  which  are  drawing  positions.  (Recall  that  a  winning  position 
is  one  from  which  there  is  a  move  to  a  losing  position,  whereas  a  losing 
position  is  one  from  which  every  move  leads  to  a  winning  position;  all  other 
positions  are  drawing  positions,  as  from  these  you  cannot  force  a  win  nor 
be  forced  to  lose.)  As  a  start,  9  is  a  losing  position  in  both  games,  while  8 
is  a  winning  position  in  both  games,  as  you  can  win  by  moving  one  space 
forward.  For  the  non-losing  positions,  indicate  the  optimal  move(s). 


The Chess-playing computer Deep Blue owes a large part of its
success to its ability to search for a winning strategy in a manner similar to
the above analysis. The salvation for such games comes from the fact that
there are astronomically many configurations to consider, far too many for
a modern (and indeed any conceivable) computer to analyse. Today’s
Kasparovs are safe in the knowledge that Chess-playing computers such as
Deep Blue must still invoke questionable decision-making procedures, but
perhaps one day a Very Deep Blue will render Chess-playing, like playing
Noughts and Crosses, a pointless activity.

In  the  rest  of  this  chapter  we  consider  several  moderately-simple  two- 
player  games-of-no-chance,  and  try  to  understand  the  strategies  which  a 
player  should  use  in  order  to  win  them. 


10.2  Nim

Nim  is  a  simple  and  ancient  game  played  with  coins,  thought  to  be  Chinese 
in  origin.  To  play  this  game,  an  arbitrary  number  of  piles  of  coins  are 
formed,  each  with  an  arbitrary  number  of  coins  in  them,  and  two  players 
alternate  in  removing  one  or  more  coins  from  any  one  pile.  Whoever  takes 
the  last  coin  is  declared  to  be  the  winner. 

This  game  is  trivial  when  played  with  only  one  or  two  piles  of  coins,  or 
with  three  very  small  piles.  The  analysis  of  the  game  in  these  cases  is  as 
follows. 

1.  In the one-pile game, the first player has a trivial winning strategy:
take  all  the  coins  in  the  first  move. 

2.  In  the  two-pile  game, 

(a)  if  the  piles  contain  an  equal  number  of  coins  then  the  second 
player  has  a  winning  strategy:  always  take  the  same  number  of 
coins  as  the  first  player,  repeatedly  leaving  the  first  player  with 
equal-sized  piles. 

(b)  if  the  piles  contain  an  unequal  number  of  coins,  then  the  first 
player  has  a  winning  strategy:  start  by  taking  coins  from  the 
larger  pile  to  leave  equal-sized  piles,  and  then  use  the  strategy 
described  in  2(a)  for  the  second  player. 

3.  In  the  three-pile  game, 

(a)  if  two  of  the  piles  are  equal,  then  the  first  player  has  a  winning 
strategy:  take  all  of  the  coins  in  the  third  pile,  leaving  just  the  two 
equal-sized  piles  (and  one  empty  pile),  and  then  use  the  strategy 
described  in  2(a)  for  the  second  player. 

(b)  if the piles contain one, two, and three coins, respectively, then
the second player has a winning strategy:

i.  if the first player takes the whole of one of the piles, then
there will be just two unequal piles left (and one empty pile),
and the second player can win using the strategy described
in 2(b) for the first player.

ii.  if  the  first  player  takes  just  part  of  one  of  the  piles,  then  there 
will  be  three  non-empty  piles  remaining,  two  of  which  must 
be  equal,  and  the  second  player  can  win  using  the  strategy 
described  in  3(a)  for  the  first  player. 
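
The small cases above can be double-checked by exhaustive search. This is only an illustrative sketch: the function name `mover_wins` and the encoding of a position as a sorted tuple of non-empty pile sizes are assumptions of the example, not notation from the text, and the search is practical only for small piles.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mover_wins(piles):
    """True if the player to move can force a win from `piles`,
    given as a sorted tuple of non-empty pile sizes."""
    for i, size in enumerate(piles):
        for take in range(1, size + 1):
            rest = list(piles)
            rest[i] = size - take
            nxt = tuple(sorted(p for p in rest if p > 0))
            if not mover_wins(nxt):  # this move leaves the opponent losing
                return True
    return False  # no coins left, or every move hands the opponent a win


print(mover_wins((5,)))       # True:  case 1, take the whole pile
print(mover_wins((4, 4)))     # False: case 2(a), equal piles
print(mover_wins((2, 5)))     # True:  case 2(b), unequal piles
print(mover_wins((2, 2, 7)))  # True:  case 3(a), two equal piles
print(mover_wins((1, 2, 3)))  # False: case 3(b)
```

Sorting the piles before each recursive call lets the cache treat positions that differ only in the order of the piles as the same position.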

The game is traditionally played with three piles, containing three, four,
and five coins, respectively, and even here its complexity starts to become
convincing; after playing many times, it remains difficult to glean any good
long-term strategy. The only approach to the game which comes
immediately to mind, reminiscent of games like Chess, is to look ahead several
moves, anticipating the moves of the other player, in order to avoid bad
positions. With time, it is possible to recognise more and more bad positions,
and become better at avoiding these. However, the character of the game
changes with four, five, or more piles.

In fact there is a straightforward winning strategy for this game, either for
the first player or the second player, depending on the number of piles and
the number of coins in each pile. To see this, write out the numbers of coins
in the piles in binary notation, one above the other, and add up the columns
modulo 2; that is, compute the sum of a column to be 0 if it has even parity
(i.e., there are an even number of 1’s in the column), and 1 if it has odd
parity (i.e., there are an odd number of 1’s in the column). If all columns
have an even parity, we shall say that the position is balanced; otherwise
we say that the position is unbalanced. The following observations can be
made:


1.  If  every  column  has  even  parity  (i.e.,  we  are  in  a  balanced  position), 
then  every  move  will  result  in  some  column  having  odd  parity  (i.e., 
every  move  leads  to  an  unbalanced  position). 


2.  If one or more columns have odd parity (i.e., we are in an unbalanced
position), then there is a move which will result in every column having
even parity (i.e., there is a move leading to a balanced position).

For example, in the 3-4-5-7 game, the first and third columns have odd
parity (while the second column has even parity). By taking 3 coins from
the second pile, we give the first and third columns even parity (while
leaving the parity of the second column even).

     3:  0 1 1           3:  0 1 1
     4:  1 0 0    =>     1:  0 0 1
     5:  1 0 1           5:  1 0 1
     7:  1 1 1           7:  1 1 1
         -----               -----
         1 0 1               0 0 0
         ^   ^

From this new position, whatever coins are removed, there will result at
least one column with odd parity.

With the above two observations, along with the insight that the ultimate
goal of the game is to make all columns add up to zero, and hence an even
number, it is clear that:

1.  the  first  player  has  a  winning  strategy  if  one  or  more  of  the  columns 
has  odd  parity:  the  correct  move  is  to  remove  coins  so  as  to  leave  all 
columns  with  even  parity; 

2.  the  second  player  has  a  winning  strategy  if  all  of  the  columns  have 
even  parity:  regardless  of  what  move  is  made,  the  resulting  parity  of 
at  least  one  column  will  be  odd. 
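
The column-parity calculation can be carried out mechanically, because adding binary columns modulo 2 is exactly the bitwise exclusive-or of the pile sizes. The following Python sketch illustrates this; the function names `column_parities` and `balancing_moves` are our own and purely illustrative.

```python
from functools import reduce
from operator import xor


def column_parities(piles):
    """Return the column parities as one integer: the XOR of all pile sizes.

    Bit j of the result is 1 exactly when column j has odd parity.
    """
    return reduce(xor, piles, 0)


def balancing_moves(piles):
    """List every move (pile_index, coins_to_take) that leaves a balanced position."""
    parity = column_parities(piles)
    moves = []
    for index, pile in enumerate(piles):
        # Flipping the odd-parity columns of this pile gives the size it
        # would need to have; this is a legal move only if the pile shrinks.
        target = pile ^ parity
        if target < pile:
            moves.append((index, pile - target))
    return moves


if __name__ == "__main__":
    print(bin(column_parities([3, 4, 5, 7])))  # parities of the 3-4-5-7 game
    print(balancing_moves([3, 4, 5, 7]))       # includes (1, 3): take 3 from the second pile
```

A balanced position yields an empty list, in line with the first observation: no move from a balanced position leads to another balanced position.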


Exercise 10.5 (Solution on page 456)

If  a  player  is  in  a  winning  position  in  Nim,  then  there  will  in  general  be  more 
than  one  winning  move.  (Two  moves  are  different  if  they  involve  different 
piles,  or  if  they  involve  the  same  pile  but  removing  different  numbers  of 
coins.)  What  is  the  maximum  number  of  different  winning  moves  possible 
from  a  Nim  position  with  n  piles?  Justify  your  answer. 


Exercise 10.6 (Solution on page 456)

Suppose  we  change  the  rules  of  Nim  slightly  so  that  the  first  player,  instead 
of  removing  some  coins  from  a  pile,  has  the  additional  option  of  creating  a 
new  pile  of  any  size  (with  at  least  one  coin  in  it);  the  first  player  may  do 
this  at  most  once  during  a  play  of  the  game.  Under  which  circumstances  can 
the  first  player  force  a  win  with  the  help  of  this  extra  move?  (Consider,  in 
particular,  the  two  situations  in  which  the  game  starts  from  an  unbalanced, 
respectively  a  balanced,  position.)  Justify  your  answer. 


10.3 Fibonacci Nim

The next game we consider is a variation on Nim called Fibonacci Nim.
In this game we have a single pile containing n ≥ 2 coins. The first player
removes  one  or  more  coins  but  not  the  whole  pile.  From  then  on,  the  players 
alternate  moves,  each  person  removing  one  or  more  coins,  but  not  more  than 
twice  as  many  coins  as  the  other  player  has  taken  in  the  preceding  move. 
The  player  who  removes  the  last  coin  wins. 

The  analysis  of  this  game  is  complicated  by  the  fact  that  a  player’s 
available  moves  depend  on  the  opponent’s  last  move.  However,  we  can 
nonetheless  easily  analyse  small  instances  of  this  game: 

2 coins: The first player must take 1 coin, leaving the second player to take
the last coin. Hence in this case, the second player has a (trivial)
winning strategy.

3 coins: The first player must take 1 or 2 coins; in either case the second
player can take all remaining coins. Hence in this case, the second
player again has a (trivial) winning strategy.

4 coins: The first player can take 1 coin, leaving the second player to take
either 1 or 2 of the remaining 3 coins; the second player can thus not
avoid losing as described in the 3-coin game for the first player. Hence
in this case, the first player has a winning strategy.

5 coins: If the first player takes more than 1 coin, then the second player
will be able to take all remaining coins; thus, in order to win the first
player must take only 1 coin, leaving 4 coins. But then the second
player can win by using the strategy described in the 4-coin case for
the first player. Hence in this case, the second player has a winning
strategy.

6 coins: The first player can take 1 coin, leaving 5 coins to the second
player. The second player can then not avoid losing as described in
the 5-coin case for the first player. Hence in this case, the first player
has a winning strategy.

7 coins: The first player can take 2 coins, leaving 5 coins to the second
player. The second player can then not avoid losing as described in
the 5-coin case for the first player. Hence in this case, the first player
has a winning strategy.

8 coins: If the first player takes more than 2 coins, then the second player
will be able to take all remaining coins; thus, in order to win the first
player must take only 1 or 2 coins, leaving either 7 or 6 coins. The
second player can then win by using the strategy described in the 7-coin
or 6-coin case for the first player. Hence in this case, the second
player has a winning strategy.

9 coins: The first player can take 1 coin, leaving 8 coins to the second
player. The second player can then not avoid losing as described in
the 8-coin case for the first player. Hence in this case, the first player
has a winning strategy.
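
The case analysis above can be mechanised by a direct search over positions. In the following Python sketch, a position is described by the number of coins left together with the largest number of coins the player to move is allowed to take; the function names `wins` and `first_player_wins` are our own, and the code is only an illustration of the exhaustive reasoning, not an efficient method.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def wins(coins, limit):
    """True if the player to move can force a win.

    coins: number of coins left in the pile (at least 1).
    limit: the largest number of coins the player to move may take.
    """
    for take in range(1, min(coins, limit) + 1):
        if take == coins:
            return True  # taking the last coin wins outright
        # After taking `take` coins, the opponent may take at most 2 * take.
        if not wins(coins - take, 2 * take):
            return True
    return False


def first_player_wins(n):
    """Opening position with n >= 2 coins: the first move may not take the whole pile."""
    return wins(n, n - 1)


if __name__ == "__main__":
    for n in range(2, 10):
        print(n, "first" if first_player_wins(n) else "second")
```

Running the loop reproduces the verdicts for 2 to 9 coins worked out by hand above.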

We  can  exhaustively  work  out  winning  strategies  this  way,  but  the  reasoning 
is  indeed  exhausting.  It  would  be  a  major  effort,  for  example,  to  work  out 
if  we  have  a  winning  strategy  as  the  first  player  starting  with  100  coins,  and 
if  so  how  many  coins  we  should  take.  There  is,  however,  a  straightforward 
way  to  work  out  who  has  the  winning  strategy,  and  what  the  winning  move 
is if one exists. To determine this, we first recall the following.


Theorem 10.6 The Fibonacci Number System

Every integer N > 0 can be expressed uniquely as a sum of Fibonacci
numbers

$N = f_{k_1} + f_{k_2} + f_{k_3} + \cdots + f_{k_n}$

where $0 \ll k_1 \ll k_2 \ll k_3 \ll \cdots \ll k_n$. (Here, $i \ll j$ means that $i \le j-2$.)
For example, $100 = 3 + 8 + 89 = f_4 + f_6 + f_{11}$.

Proof:  This  is  Zeckendorf’s  Theorem  which  we  proved  in  Example  9.15. 

□ 


Theorem 10.7

The first player has a winning strategy in Fibonacci Nim starting with
n coins if, and only if, n is not a Fibonacci number. In this case, the
winning strategy, when n coins remain, is always to take $f_{k_1}$ coins, where
$n = f_{k_1} + f_{k_2} + \cdots + f_{k_r}$ (with $0 \ll k_1 \ll k_2 \ll \cdots \ll k_r$) is the representation of n
in the Fibonacci number system.

For example, in the game starting with $100 = f_4 + f_6 + f_{11}$ coins, the winning
opening move is to take $f_4 = 3$ coins.
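
The representation in the Fibonacci number system can be computed greedily, by repeatedly subtracting the largest Fibonacci number that still fits. The following Python sketch does this and reads off the opening move named in Theorem 10.7; the function names `zeckendorf` and `opening_move` are our own, and the code assumes a positive integer argument.

```python
def zeckendorf(n):
    """Return the Fibonacci numbers summing to n, smallest first.

    Uses the greedy rule: always subtract the largest Fibonacci number
    not exceeding what remains. The resulting terms are never adjacent
    Fibonacci numbers.
    """
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    terms = []
    for f in reversed(fibs):
        if f <= n:
            terms.append(f)
            n -= f
    return terms[::-1]


def opening_move(n):
    """Coins to take at the start of Fibonacci Nim, or None if n is a Fibonacci number."""
    terms = zeckendorf(n)
    if len(terms) == 1:
        return None  # n is itself a Fibonacci number: no winning opening move
    return terms[0]


if __name__ == "__main__":
    print(zeckendorf(100))    # [3, 8, 89]
    print(opening_move(100))  # 3
```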


Exercise 10.7 (Solution on page 457)
Prove  Theorem  10.7. 


10.4 Chomp

In the game of Chomp, we have an m×n chocolate bar, in which the
leftmost-topmost square (1, 1) is poisonous. Two players take turns taking
bites  out  of  the  chocolate  bar,  with  each  player  having  to  choose  a  remaining 
square  and  eat  it  along  with  all  remaining  squares  below  and  to  the  right. 
The  goal  is  to  force  the  other  player  to  eat  the  poisonous  square. 

As  before,  we  can  easily  analyse  small  instances  of  this  game. 

1. In the 1×1 case, the first player loses right away; hence the second
player has a trivial winning strategy.

2. In the 1×n case with n>1 (or, similarly, the m×1 case with m>1), the
first player has a trivial winning strategy: bite off all but the poisonous
square, leaving just the poisonous square for the second player to take.

3. In the 2×n (or m×2) case, the first player has a simple winning
strategy: bite off one square, leaving a 2×n rectangle with the
bottom-right square missing. To see why this works, observe the following:

(a) if the remaining chocolate is a 2×k rectangle with the
bottom-right square missing, then every move will result in a shape
different from this.

(b) if the remaining chocolate is not in the shape of a 2×k rectangle
with only the bottom-right square missing, then some move will
result in a shape of this form.

With this observation, along with the insight that the ultimate goal of
the game is to leave just the poisonous square which has the shape of
a 2×1 rectangle with the bottom-right square missing, it is clear that
the first person has a winning strategy.

4. In the n×n (square) case, the first player again has a simple winning
strategy: bite off the (n−1)×(n−1) sub-square, leaving just the top
row and left column. From here, just mimic every move of the second
player, biting off as many squares from the row (respectively, column)
as the second player bites off the column (respectively, row).

Apart  from  these  special  cases,  very  little  is  known  about  winning  strategies 
in  this  game.  The  only  way  to  find  the  winning  strategy  is  to  explore 
moves,  and  responses  to  moves,  and  responses  to  responses  to  moves,  etc. 
For example, consider the 3×4 game. In this case, it is not a good idea to
take just one square (as was the strategy in case 3 above), nor to take all but
the first row and first column (as was the strategy in case 4 above). However,
the first player does have a winning strategy, which starts by biting off a
2×2 square. This leaves the second player with 7 moves to choose from;
whatever  move  the  second  person  takes,  though,  will  be  bad,  as  can  be  seen 
in  Figure  10.2. 
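
For small bars, the exploration of moves, responses, and responses to responses can be left to a program. The following Python sketch is one way of doing this; it represents a position by the tuple of remaining row lengths (top row first), uses 0-based coordinates so that the poisonous square is (0, 0), and treats the poisonous square as never bitten, so that a player left with only that square has lost. The names `bite`, `mover_wins` and `winning_bites` are our own.

```python
from functools import lru_cache


def bite(rows, r, c):
    """Remove square (r, c) together with everything below and to the right of it."""
    return tuple(length if i < r else min(length, c)
                 for i, length in enumerate(rows))


def bites(rows):
    """All legal bites (r, c) in a position, excluding the poisonous square (0, 0)."""
    return [(r, c) for r, length in enumerate(rows)
            for c in range(length) if (r, c) != (0, 0)]


@lru_cache(maxsize=None)
def mover_wins(rows):
    """True if the player to move can force the opponent to eat the poison."""
    # With only the poisonous square left there are no bites, so this is False.
    return any(not mover_wins(bite(rows, r, c)) for r, c in bites(rows))


def winning_bites(m, n):
    """Opening bites in m-by-n Chomp that leave the opponent in a losing position."""
    start = (n,) * m
    return [(r, c) for r, c in bites(start)
            if not mover_wins(bite(start, r, c))]


if __name__ == "__main__":
    print(winning_bites(3, 4))  # includes (1, 2), i.e. biting off the 2x2 square
```

The search visits every reachable shape, so it is only practical for small bars; it confirms that a winning opening exists but gives no insight into why.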

Despite  the  difficulty  of  this  game,  we  can  easily  prove  the  following 
remarkable  fact. 

Theorem 10.8

Except for the degenerate 1×1 case, the first player always has a winning
strategy. 


Proof:  Suppose,  for  the  sake  of  argument,  that  the  second  player  has  a 
winning  strategy.  This  means,  in  particular,  that  whatever  move  the  first 
person  opens  the  game  with,  the  second  person  has  a  response  which  will 
leave  the  chocolate  in  a  configuration  from  which  the  first  person  cannot 
win. 

Consider the response that the second person makes using this winning
strategy if the first person opens by biting off just a single square (the
bottom-right corner). Whatever this response, it is a move which the first
player could equally have opened the game with, since every bite removes
the bottom-right corner in any case; this would leave the second player to
play from the losing configuration.

This contradicts the assumption that the second player has the winning
strategy. □

Figure 10.2: An analysis of 3×4 Chomp. In 3×4 Chomp the first player has a
winning strategy, in which the opening move is to bite off a 2×2 square. The
second player has 7 possible responses to this move, but every one of them is
bad: the first player has a response to each of these which will bring victory
closer.

This is indeed an interesting state of affairs. In this game, we know
that the first player has a winning strategy, but apart from exhaustively
analysing all possible plays of the game, no general method is known for
working out how to win as the first player.


10.5 Hex

The game of Hex is played on a board consisting of an n×n grid of hexagons,
as  shown  in  Figure  10.3.  At  the  beginning  of  the  game,  the  first  player  is 
considered  to  own  the  territories  to  the  North-East  and  South-West  of  the 
board  (the  two  sides  labelled  with  crosses  x  in  the  figure),  while  the  second 
player  is  considered  to  own  the  territories  to  the  North-West  and  South-East 
of  the  board  (the  two  sides  labelled  with  noughts  o  in  the  figure).  The  object 
of  the  game  for  each  player  is  to  create  a  path  through  the  board  joining 
their  disconnected  territories.  The  players  alternate  moves;  the  first  player 
places  a  cross  x  in  a  vacant  hexagon,  and  the  second  player  follows  on  by 
placing  a  nought  o  in  a  vacant  hexagon.  The  winner  of  the  game  is  the  first 
player  to  connect  their  two  sides  of  the  board  with  a  contiguous  chain  of 
hexagons labelled with their symbol.


Theorem 10.9

The  game  of  Hex  can  never  end  in  a  draw. 


Proof:  An  informal  argument  runs  as  follows.  Think  of  the  crosses  as  land 
and the noughts as water; when all the hexagons are labelled, either there is an
isthmus connecting the two continents, or else water flows between the two
oceans. In the first case, the first player has a winning chain of x-labelled
hexagons,  while  in  the  latter  case  the  second  player  has  a  winning  chain  of 
o-labelled  hexagons. 

A formal proof would require a fair amount of explanation; here we provide
only an outline. Assuming that every hexagon is labelled with a cross
x or a nought o, we show that one of the opposite pairs of sides is connected
in a winning fashion. To see this, we imagine tracing a path along
the  boundaries  of  the  hexagons,  entering  the  grid  at  the  left-most  corner.  At 
each  junction  we  look  at  the  territory  we  are  facing;  if  it  is  labelled  x  then 
we  turn  left,  and  if  it  is  labelled  o  then  we  turn  right.  If  we  do  this,  then 
we  shall  trace  a  path  which  always  has  x  on  its  right  and  o  on  its  left,  and 
the  path  will  exit  the  grid  either  at  the  top  or  at  the  bottom.  In  the  first 
case,  the  x-hexagons  to  the  right  of  the  path  include  a  winning  path  for  the 
first  player,  and  in  the  second  case,  the  o-hexagons  to  the  left  of  the  path 
include  a  winning  path  for  the  second  player.  Figure  10.4  gives  an  example 
in  which  the  second  player  has  a  winning  chain.  □ 
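
Theorem 10.9 can be checked experimentally on small boards. The following Python sketch fills an n×n board at random and uses a breadth-first search to look for a winning chain for each player. It rests on some modelling assumptions of our own: the board is stored as a list of rows, each hexagon (r, c) has the six neighbours listed in `NEIGHBOURS`, the x player is taken to link the first row to the last row, and the o player to link the first column to the last column. The function names are likewise our own.

```python
import random
from collections import deque

# The six neighbours of a hexagon in the usual row/column layout of a Hex board.
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]


def has_chain(board, player):
    """True if `player` ('x' or 'o') has a chain linking their two sides."""
    n = len(board)
    if player == "x":
        starts = [(0, c) for c in range(n)]       # first row
        at_goal = lambda r, c: r == n - 1         # last row
    else:
        starts = [(r, 0) for r in range(n)]       # first column
        at_goal = lambda r, c: c == n - 1         # last column
    queue = deque(cell for cell in starts if board[cell[0]][cell[1]] == player)
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if at_goal(r, c):
            return True
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            if (0 <= rr < n and 0 <= cc < n and (rr, cc) not in seen
                    and board[rr][cc] == player):
                seen.add((rr, cc))
                queue.append((rr, cc))
    return False


def random_full_board(n, rng):
    """An n-by-n board with every hexagon labelled 'x' or 'o' at random."""
    return [[rng.choice("xo") for _ in range(n)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    board = random_full_board(5, rng)
    print(has_chain(board, "x"), has_chain(board, "o"))  # exactly one is True
```

On every completely labelled board exactly one of the two searches succeeds, which is what the theorem asserts; the experiment is of course no substitute for the proof.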

Knowing that this game can never end in a draw, we can then prove
the following.


Theorem 10.10

The  first  player  always  has  a  winning  strategy  for  Hex. 


Proof:  Suppose  for  the  purpose  of  argument  that  the  second  player  has  a 
winning  strategy.  Then  the  first  player  may  play  as  follows. 

1.  She  may  label  any  hexagon  chosen  at  random,  and  then  forget  that 
she  has  done  this. 

2.  She  may  then  pretend  from  this  point  on  that  she  is  playing  the  game 
as  the  second  player,  using  the  (supposed)  winning  strategy  for  the 
second  player. 

3. If at any time this strategy dictates that she should label the
pre-labelled hexagon, then she should simply label any other unlabelled
hexagon at random, pretend that it isn’t labelled, and pretend that
her response was to label the pre-labelled hexagon.

In this fashion, the first player is stealing the winning strategy from the
second player, and using it to win the game; the extra labelled hexagon can
only help her, never hinder her. This proves that if the second
player has a winning strategy, then the first player has a winning strategy,
which of course is a contradiction. □

Again,  as  with  Chomp,  we  are  able  to  prove  that  the  first  player  has  the 
winning  strategy  in  Hex,  but  our  proof  gives  no  indication  as  to  what  that 
strategy might be!


10.6 Bridg-It

The game of Bridg-It is similar to Hex, but is played on a staggered n×n
board  as  depicted  in  Figure  10.5.  The  goal  of  the  first  player  is  to  link 
the  left-  and  right-hand  borders,  while  the  goal  of  the  second  player  is  to 
link  the  top  and  bottom  borders.  The  two  players  alternate  moves;  the 
first  player  joins  two  neighbouring  spots  •  either  horizontally  or  vertically, 
and  the  second  player  joins  two  neighbouring  circles  o  either  horizontally 
or  vertically.  Neither  player  can  cross  a  link  previously  made  by  the  other 
player. 

Theorem 10.11

The  game  of  Bridg-It  can  never  end  in  a  draw. 


Proof:  Assuming  that  no  further  moves  can  be  made,  the  board  will  depict 
a  simple  maze  pattern,  such  as  that  given  in  Figure  10.6.  Entering  the  maze 
from  the  bottom  left,  there  is  a  unique  path  through  the  maze,  which  always 
has the first player’s •-links on its left and the second player’s o-links on its
right. This path must exit the maze at either the bottom right or the top
left. (The path cannot exit the maze at the top right, as then it would end
with o-links on its left and •-links on its right.) In the first case, the •-links
to the left of the path contain a winning path for the first player, and in the
second case, the o-links to the right of the path contain a winning path for
the second player. □

Figure 10.6: Bridg-It never ends in a draw.


Knowing that this game can never end in a draw, we can then prove
the following.

Theorem 10.12

The  first  player  always  has  a  winning  strategy  for  Bridg-It. 


Proof:  The  reasoning  is  identical  to  that  used  in  the  proof  of  the  analogous 
result  for  Hex.  □ 

Yet  again  this  proves  that  the  first  player  has  a  winning  strategy  without 
giving  any  indication  as  to  what  that  strategy  might  be.  However,  in  this 
case, we can explicitly describe a winning strategy for the first player.
Referring to Figure 10.7, the first player should open with the link indicated.
From  that  point  onwards,  each  link  that  the  second  player  makes  will  touch 
the  end  of  one  of  the  dotted  lines  depicted  in  Figure  10.7;  in  response,  the 
first  player  should  add  the  link  which  touches  the  other  end  of  this  dotted 
line.  In  this  way,  the  first  player  will  successfully  block  any  attempt  by 
the  second  player  to  create  a  path  linking  the  top  and  bottom  borders,  and 
hence  she  will  herself  eventually  win. 

Exercise 10.12 (Solution on page 459)

Argue  that  the  above  does  indeed  describe  a  winning  strategy  for  the  first 
player.


10.7 Additional Exercises

1.  Consider  the  following  game: 

Starting  with  5  coins,  two  players  take  turns  taking  1  or  2  coins; 
and  whoever  ends  up  with  an  odd  number  of  coins  wins. 

(a)  Draw  the  complete  game  tree  for  this  game,  and  determine  who 
has  the  winning  strategy. 

(b)  Who  has  the  winning  strategy  in  this  game  when  started  with 
n  coins,  where  n  is  an  arbitrary  odd  number?  (Hint:  for  each 
n  =  1,2,3,...,  determine  whether  or  not  you  have  a  winning 
move  if  it  is  your  turn  and  there  are  n  coins  left,  and  you  are 
currently  holding  an  even  number  of  coins;  in  parallel  to  this, 
determine  whether  or  not  there  is  a  winning  move  if  it  is  your 
turn  and  there  are  n  coins  left,  and  you  are  currently  holding  an 
odd  number  of  coins.  Look  for  a  pattern.) 

2.  Consider  the  following  game: 

Starting  with  5  coins,  each  player  takes  turns  taking  1,  2  or  3 
coins;  and  whoever  ends  up  with  an  odd  number  of  coins  wins. 

(a)  Draw  the  complete  game  tree  for  this  game,  and  determine  who 
has the winning strategy.

(b) Who has the winning strategy in this game when started with n
coins,  where  n  is  an  arbitrary  odd  number?  (The  same  hint  as 
for  question  1(b)  applies.) 

3.  Consider  the  following  game: 

Starting  with  a  single  pile  of  coins,  two  players  alternate  taking 
either  1  coin  or  half  of  the  remaining  coins,  including  the  leftover 
coin  if  there  is  an  odd  number  of  coins  remaining.  Thus,  for 
example,  if  there  are  25  coins  in  the  pile  then  a  move  consists  of 
taking  either  1  coin  or  13  coins;  if  13  coins  are  taken  leaving  12 
in  the  pile,  then  the  next  move  will  consist  of  taking  either  1  coin 
or  6  coins.  The  player  who  takes  the  last  coin  wins. 

(a)  For  each  number  n  from  1  to  10,  explain  who  has  the  winning 
strategy  in  this  game  starting  from  a  pile  of  n  coins.  In  the  cases 
in  which  the  first  player  has  the  winning  strategy,  state  how  many 
coins  the  first  player  should  take. 

(b)  Argue  that  the  first  player  has  a  winning  strategy  in  the  game 
starting  with  n  coins  if,  and  only  if,  the  binary  representation  of 
n  ends  in  an  even  number  of  0’s.  Specifically, 

•  if  the  binary  representation  of  n  ends  in  an  even  number  of 
0’s,  then  either  n=  1  and  you  can  win  by  taking  the  single 
coin,  or  there  is  a  move  which  leaves  a  number  of  coins  whose 
binary  representation  ends  in  an  odd  number  of  0’s;  and 

•  if  the  binary  representation  of  n  ends  in  an  odd  number  of 
0’s,  then  every  move  leaves  a  number  of  coins  whose  binary 
representation  ends  in  an  even  number  of  0’s. 

4.  Consider  the  following  game: 

Starting  with  a  pile  of  n  coins,  two  players  alternately  remove  a 
number  of  coins  which  is  a  power  of  2.  That  is,  a  player  may  take 
1 coin, or 2 coins, or 4 coins, or 8 coins, or 16 coins, or in general $2^k$ coins
for any k. The player who takes the last coin wins.

Argue  that  the  second  player  has  a  winning  strategy  if,  and  only  if,  n 
is  a  multiple  of  3. 

5.  Consider  the  following  game: 

Starting  with  a  pile  of  n  coins,  two  players  alternately  remove 
either  1  or  3  or  8  coins.  The  player  who  takes  the  last  coin  wins. 

Argue  that  the  second  player  has  a  winning  strategy  if,  and  only  if,  n 
is of the form $11k$ or $11k+2$ or $11k+4$ or $11k+6$.

6. (a) The game of Clock-1-3 is identical to the game of Clock-2-3
from Exercise 10.3 (page 258) except in this game the token
moves either 1 or 3 hours forward.

Work out who has the winning strategy in the game of Clock-1-3
starting from each of the 12 hours. In the cases in which
the first player has the winning strategy, state how many hours
forward (1 or 3) the first player should move the token.

(b) The game of Clock-1-4 is identical to the game of Clock-2-3
from Exercise 10.3 (page 258) except in this game the token
moves either 1 or 4 hours forward.

Work out who has the winning strategy in the game of Clock-1-4
starting from each of the 12 hours. In the cases in which
the first player has the winning strategy, state how many hours
forward (1 or 4) the first player should move the token.

7. (a) The game of Days-of-the-Year is played by two players who
take turns naming a date of the year starting from January 1st.
On  any  move  a  player  may  increase  the  month  or  the  day  but  not 
both.  Thus,  for  example,  the  first  player  can  start  the  game  by 
naming  any  day  in  January  (apart  from  the  1st),  or  the  1st  of  any 
month  of  the  year  (apart  from  January).  The  player  who  names 
December  31st  wins. 

Work  out  who  has  the  winning  strategy  for  this  game.  For  a 
start,  you  can  note  that  there  is  a  winning  move  from  any  date  in 
December  (apart  from  the  31st),  as  well  as  from  the  31st  of  any 
month  (apart  from  December). 

(b)  The  game  of  Days-of-the-Century  is  played  by  two  players 
who  take  turns  naming  a  date  in  the  20th  century  starting  from 
1st  January  1900.  On  any  move  a  player  may  increase  the  month 
or  the  day  or  the  year,  but  only  one  of  these  three.  The  player 
who  names  31st  December  1999  wins. 

Work  out  who  has  the  winning  strategy  for  this  game. 

8. In Misère Noughts and Crosses, it would seem sensible to avoid
placing the first cross in the centre, as the centre is involved in the
most winning lines. However, as the Hint in Exercise 10.2 suggests, a
sensible opening move in Misère Noughts and Crosses is to place
a cross in the centre.

This  is  in  fact  the  only  sensible  opening  move.  Suppose  that  the  first 
player  starts  by  placing  a  cross  somewhere  other  than  the  centre,  that 
is,  in  a  corner  or  side  square.  Show  that  the  second  player  has  a 
winning strategy from this position.

9. The following depicts a simple variant of the children’s board game
Snakes  and  Ladders. 


The  rules  of  the  game  are  as  described  in  Exercise  10.4  (page  259). 

Identify which of the positions are winning positions; which are losing
positions; and which are drawing positions. For the non-losing positions,
indicate the optimal move(s).

10.  Who  has  the  winning  strategy  in  Nim  when  you  start  with  n  piles  each 
containing  an  equal  number  of  coins?  Justify  your  answer  without 
referring  to  the  general  theory  of  Nim,  that  is,  without  referring  to 
balanced  versus  unbalanced  positions. 

11. In Misère Nim, the objective is to not take the last coin. What is
the winning strategy for this variation?

12. The game of Nim-k is played just like Nim except that in a single
move a player can remove (a different number of) coins from up to k
different piles. Thus Nim-1 is the usual game of Nim.

Prove that there is a winning strategy in Nim-k from a given collection
of piles if, and only if, when writing out the numbers of coins
in the piles in binary notation, one above the other, and adding up
the columns, the sum of at least one of the columns is not divisible
by k+1.

13. Does the first player have any other safe opening moves in 3×4 Chomp
apart from the one outlined in Figure 10.2?

14. What are the possible safe opening moves in 3×5 Chomp?

15. In this exercise we use a simple game to prove the result from Example 6.16
that the set $[0,1] = \{x : 0 \le x \le 1\}$ is uncountable.

In this game, a subset $S \subseteq [0,1]$ of real numbers between 0 and 1 is
fixed, and the two players A and B take turns choosing real numbers
$a_0, b_0, a_1, b_1, a_2, b_2, \ldots$ - with A choosing the $a_i$s and B choosing the $b_i$s
- starting from $a_0 = 0$ and $b_0 = 1$. When choosing $a_i$, A must choose a
value satisfying $a_{i-1} < a_i < b_{i-1}$; and when choosing $b_i$, B must choose
a value satisfying $a_i < b_i < b_{i-1}$. That is, A starts at 0 and B starts
at 1, and they take turns moving towards - but never reaching - the
other.

The increasing sequence $a_0, a_1, a_2, \ldots$ which A is choosing must converge
towards a limit value $a$; that is, $a$ is the smallest real value which
is bigger than every $a_i$. If $a \in S$ then A wins the game; otherwise B
wins the game.

(a) Prove that if $S$ is countable then B has a winning strategy in
this game. (Hint: Given an enumeration $s_1, s_2, s_3, \ldots$ of $S$, B’s
winning strategy is to choose $b_i = s_i$ whenever possible.)

(b) Deduce from the above that $[0,1]$ is uncountable. (Hint: A clearly
has the winning strategy when $S = [0,1]$.)

16.  This  exercise  exposes  a  paradox  devised  by  the  mathematician  William 
Zwicker. 

Professor Bertrand likes every game which can never be played forever;
and he hates any game which may potentially go on forever. For
example,  he  likes  Nim,  but  he  hates  lawn  tennis,  as  it  could  potentially 
get  into  an  infinite  “advantage-deuce”  cycle. 

Consider  the  game  of  Russell  whose  rules  are  as  follows: 

•  The  first  player  chooses  any  game  that  must  terminate. 

•  The  two  players  play  the  chosen  game,  with  the  second  player 
making  the  first  move  in  the  chosen  game. 

Does Professor Bertrand like the game of Russell? Explain.


Modelling Processes


If  you  can’t  describe  what  you  are  doing  as  a  process,  then  you  don’t 
know  what  you’re  doing. 

-  W.  Edwards  Deming. 


Having  mastered  the  basic  mathematical  machinery  presented  in  the  first 
part  of  this  book  for  modelling  computing  systems,  the  first  question  we 
must  then  address  is:  What  exactly  do  we  mean  by  a  computing  system? 
We  are  not  speaking  here  of  the  various  hardware  components  of  a  digital 
computer  -  as  Edsger  W.  Dijkstra  noted,  computer  science  is  no  more  about 
computers  than  astronomy  is  about  telescopes.  Rather  what  we  have  in 
mind  is  any  computational  process.  Roughly  speaking,  a  process  describes 
the  behaviour  of  a  system  as  performing  various  actions  that  change  the 
system’s  state.  These  changes  are  controlled  by  a  set  of  rules  which  depend 
only  on  the  state  of  the  system  and  the  state  of  its  environment. 


For  example,  if  it  is  raining  and  we  are  outside  holding  a  closed  umbrella, 
then  we  should  perform  the  action  of  opening  the  umbrella.  This  doesn’t 
change  the  state  of  the  environment  -  it  continues  to  rain  -  but  it  changes 
our  state;  we  are  now  under  the  protection  of  the  umbrella.  The  rules  that 
we  abide  by  might  then  stipulate  that  we  should  close  our  umbrella  once 
again  if  and  when  the  rain  stops,  or  when  we  enter  a  building. 


As  another  example  which  has  a  more  computational  flavour,  consider 
the  simple  calculator  in  Figure  11.1.  The  actions  which  may  be  performed 
are  button-presses,  which  may  change  the  state  of  the  calculator  -  most 
obviously  by  changing  the  display,  but  also  by  changing  the  internal  state 
of the calculator. Of course, if the calculator is off, then the only action
which has any effect is pressing the on/off button, which starts the calculator
in its initial state; and at any time when the calculator is on, this button
can be pressed to turn it off, or the “clear” button [C] can be pressed to put
the calculator into its initial state. Thus, pressing the [C] button when the
calculator is on has the same effect as pressing the on/off button twice.


Figure 11.1: A simple calculator.

Consider carrying out a simple calculation such as 123 ÷ 45 using the
following sequence of button presses (starting from the initial state of the
calculator):

[1] [2] [3] [÷] [4] [5] [=]


As you press the first three numeric buttons, the calculator simply accumulates
these digits, displaying the sequence of digits as it increases in length.
When you then press the [÷] button, the calculator stores the number 123
in its memory, along with the operation of division, and awaits the entry of
a second number made up from a further sequence of digits, in this instance
the number 45. Pressing the [=] button tells the calculator that the second
number has been completely entered, and that the operation in its
memory should be applied between the first number 123 that it has stored
in its memory and this second number 45. The calculator will respond by
displaying the value 2.7333333.

There are many design decisions which must be made when describing
the behaviour of a calculator, though for such a simple calculator as above
most decisions are widely accepted. For example, the sequence of button
presses

[1] [2] [3] [×] [÷] [4] [5] [=]

is virtually universally accepted to mean 123 ÷ 45, recognising that the user
inadvertently pressed the [×] button and corrected this by subsequently
pressing the [÷] button to “overwrite” the operation.¹ However, we have
barely started describing the behaviour of this calculator. To specify its
complete behaviour as above would require many pages. It is not uncommon
for realistic - yet nonetheless modest - systems to have specification
documents running into several hundred pages.

¹ This interpretation is generally, though less universally, accepted in the instance when
the user presses [-] after another operator button; the correct sequence of button-presses
for calculating 123 × (-45) is of course [1] [2] [3] [×] [4] [5] [±] [=].
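To make the informal description above concrete, here is a minimal sketch of the calculator as a state machine. It is only an illustration under stated assumptions: the class `Calculator`, its fields, and the helper `calculate` are hypothetical names, the buttons × and ÷ are written as `*` and `/`, and only the digit, operator, `=` and `c` buttons are modelled. Division by zero and the on/off button are deliberately left out.

```python
import operator
from dataclasses import dataclass
from typing import Optional

# Assumed button names: "*" and "/" stand in for the × and ÷ buttons.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


@dataclass
class Calculator:
    entry: str = ""                  # digits typed so far for the current number
    stored: Optional[float] = None   # first operand, once an operator is pressed
    op: Optional[str] = None         # pending operation, if any

    def press(self, button: str) -> None:
        """Perform one button-press action, updating the state in place."""
        if button.isdigit():
            self.entry += button                 # accumulate digits
        elif button in OPS:
            if self.entry:                       # a number has just been entered
                self.stored = float(self.entry)
                self.entry = ""
            self.op = button                     # a later operator overwrites this one
        elif button == "=":
            if self.op is not None and self.stored is not None and self.entry:
                result = OPS[self.op](self.stored, float(self.entry))
                self.stored, self.op, self.entry = result, None, ""
        elif button == "c":
            self.entry, self.stored, self.op = "", None, None  # back to the initial state


def calculate(buttons: str) -> Optional[float]:
    """Press each button in turn from the initial state; return the stored value."""
    calc = Calculator()
    for b in buttons:
        calc.press(b)
    return calc.stored
```

With these assumptions, `calculate("123/45=")` and `calculate("123*/45=")` give the same result, reflecting the “overwrite” convention. Even this sketch leaves many of the design decisions mentioned above unresolved.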

Expressing  the  whole  behaviour  of  a  system  in  English  prose  as  above 
quickly  becomes  a  tedious,  lengthy,  and  extremely  error-prone  activity.  We 
clearly  need  a  formal  framework  with  which  to  describe  the  behaviour  of 
such  processes,  as  well  as  a  language  for  expressing  them.  Also,  we  need 
an  understanding  as  to  when  such  a  process  is  correct,  that  is,  exhibits  the 
behaviour that we expect (i.e., specify) that it should. These are all the
concerns  of  the  present  chapter. 


11.1  Labelled Transition Systems

In  considering  how  we  might  wish  to  view  a  computational  process,  we  can 
identify  various  of  its  underlying  properties.  Firstly,  at  any  given  moment 
in  time,  the  process  will  be  in  a  specific  state.  Secondly,  in  a  given  state, 
certain  events  or  actions  may  happen  which  will  cause  the  process  to  evolve 
into  a  new  state.  In  fact,  a  state  of  a  process  may  be  completely  determined 
by  what  actions  may  occur  in  that  state,  and  to  what  new  states  each  action 
might  lead. 

As  a  very  simple  example,  consider  a  light  switch  that  is  either  in  the 
“off”  position  and  may  be  switched  on,  or  is  in  the  “on”  position  and  may 
be  switched  off.  At  any  given  moment  in  time  the  system  (ie,  the  light)  will 
be  in  one  of  two  states,  which  we  might  refer  to  as  Off  and  On.  In  the  Off 
state  you  can  turn  the  light  on  (ie,  do  an  on  action)  to  take  the  system  to 
the  On  state,  whereas  in  the  On  state  you  can  turn  the  light  off  (ie,  do  an 
off  action)  to  take  the  system  to  the  Off  state.  We  can  picture  this  simple 
system  as  follows: 


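[Diagram: two states, Off and On, with an arrow labelled on from Off to On and an arrow labelled off from On to Off.]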


Here,  the  two  states  of  the  system  are  represented  by  circles,  and  there 
are  arrows  leading  from  one  state  to  another;  each  arrow  is  labelled  by  the 
action  which  causes  the  process  to  make  a  transition  from  one  state  to  the 
next.  (For  convenience  we’ve  also  labelled  the  two  states  -  by  Off  and 
On,  respectively  -  but  these  labels  are  inessential:  they  do  not  add  any 
information  about  what  the  process  can  do  in  any  given  state.) 

As  a  slightly  more  complicated  example,  consider  a  simple  drinks  vending 
machine  which  accepts  a  50p  coin  and  allows  the  user  to  decide  whether  to 
press a coffee button or a tea button, before returning to its initial state.
Its  behaviour  can  be  pictured  as  follows: 

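[Diagram: two states, with an arrow labelled 50p from the initial state to a second state, and two arrows labelled coffee and tea leading from the second state back to the initial state.]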


Again  we  have  two  states  represented  by  circles,  and  arrows  leading  from 
one  state  to  another,  each  labelled  by  the  action  which  causes  the  change  to 
the  state.  (In  this  case,  we’ve  not  bothered  labelling  the  states.) 

This  way  of  depicting  processes  is  captured  by  the  following  definition 
of  a  labelled  transition  system. 

Definition 11.1

A labelled transition system (LTS) is a triple $T = (\mathit{States}, \mathit{Actions}, \to)$
consisting of:

•  a set States of states;

•  a set Actions of actions or events; and

•  a set $\to\ \subseteq \mathit{States} \times \mathit{Actions} \times \mathit{States}$ of transitions between states
labelled by actions (a transition relation).


We will generally write $s \xrightarrow{a} t$, or the more pictorial form in which the states
$s$ and $t$ are drawn as circles joined by an arrow labelled $a$, instead of
$(s, a, t) \in {\to}$, meaning that in state $s$, we may do the action $a$ and thereby
evolve into the state $t$. We will also write $s \stackrel{a}{\nrightarrow}$ to signify that there are no
$a$-labelled transitions leading out of state $s$, and $s \nrightarrow$ to signify that there
are no transitions leading out of $s$.


Definition 11.2

Given an LTS $T = (\mathit{States}, \mathit{Actions}, \to)$, the extended transition relation
$\to\ \subseteq \mathit{States} \times \mathit{Actions}^* \times \mathit{States}$ is defined inductively as follows. (We use
the notation introduced above in writing $s \xrightarrow{w} t$ instead of $(s, w, t) \in {\to}$; and
give two clauses in our inductive definition: one base case for the empty
string $\varepsilon$ and one inductive case for $aw$ where $a \in \mathit{Actions}$ and $w \in \mathit{Actions}^*$.)

•  $s \xrightarrow{\varepsilon} s$; and

•  $s \xrightarrow{aw} t$ if, and only if, $s \xrightarrow{a} s' \xrightarrow{w} t$ for some $s'$.

That is, for $w = a_1 a_2 \cdots a_k$, we have $s \xrightarrow{w} t$ if, and only if,
$s \xrightarrow{a_1} s_1 \xrightarrow{a_2} \cdots \xrightarrow{a_k} t$ for some intermediate states $s_1, \ldots, s_{k-1}$.

As depicted in the examples above, labelled transition systems are typically
presented pictorially with states represented by circles, and transitions
represented  by  arrows  between  states  labelled  by  actions.  States,  actions  and 
transitions  are  exactly  the  properties  that  define  computational  processes. 


Figure  11.2  depicts  a  labelled  transition  system  with: 

•  state set $\mathit{States} = \{A, B, C, D, E, F\}$;

•  action set $\mathit{Actions} = \{a, b\}$; and

•  transition relation
   $\to\ = \{\ (A,a,A),\ (A,a,C),\ (A,a,E),$
   $\qquad (B,a,B),\ (B,a,C),\ (B,a,F),$
   $\qquad (C,a,C),\ (C,a,D),\ (D,b,E),$
   $\qquad (E,a,C),\ (E,a,E),$
   $\qquad (F,a,C),\ (F,a,D),\ (F,b,F)\ \}$.
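The definitions above translate almost directly into code. The following is a minimal sketch, not a definitive implementation: it stores the transition relation listed for Figure 11.2 as a set of triples, and the hypothetical helpers `step` and `run` compute the single-step relation and the extended transition relation of Definition 11.2, representing a word as a Python string of one-letter actions.

```python
from typing import FrozenSet, Set, Tuple

Transition = Tuple[str, str, str]  # (source state, action, target state)

# The transition relation listed above for Figure 11.2.
TRANSITIONS: FrozenSet[Transition] = frozenset({
    ("A", "a", "A"), ("A", "a", "C"), ("A", "a", "E"),
    ("B", "a", "B"), ("B", "a", "C"), ("B", "a", "F"),
    ("C", "a", "C"), ("C", "a", "D"), ("D", "b", "E"),
    ("E", "a", "C"), ("E", "a", "E"),
    ("F", "a", "C"), ("F", "a", "D"), ("F", "b", "F"),
})


def step(state: str, action: str) -> Set[str]:
    """All t such that (state, action, t) is a transition."""
    return {t for (s, a, t) in TRANSITIONS if s == state and a == action}


def run(state: str, word: str) -> Set[str]:
    """All t reachable from `state` by `word`, following Definition 11.2."""
    if word == "":                      # base case: the empty string
        return {state}
    first, rest = word[0], word[1:]     # inductive case: word = a w
    result: Set[str] = set()
    for mid in step(state, first):      # some s' with state --a--> s'
        result |= run(mid, rest)        # ... and s' --w--> t
    return result
```

An empty result, as for `step("D", "a")`, corresponds to the notation for a state having no $a$-labelled transition leading out of it.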


Labelled  transition  systems  provide  an  ideal  tool  for  modelling  situations 
which  evolve  over  time,  as  in  the  following  example. 


Example 11.3  The Man-Wolf-Goat-Cabbage Riddle


The following is a very old riddle - in fact it was posed by Alcuin of York in
the 8th century (and solved in 2009 by Homer Simpson in The Simpsons
episode titled Gone Maggie Gone). It reads as follows.

A man needs to cross a river with a wolf, a goat, and a cabbage.
His  boat  is  only  large  enough  to  carry  himself  and  one  of  his  three 
possessions,  so  he  must  transport  them  one  at  a  time.  However,  if 
he  leaves  the  wolf  and  the  goat  together  unattended,  then  the  wolf 
will  eat  the  goat ;  similarly,  if  he  leaves  the  goat  and  the  cabbage 
together  unattended,  then  the  goat  will  eat  the  cabbage.  How  can 
the  man  get  across  safely  with  his  possessions? 

Initially all four entities are on one side of the river (the left-hand side, say).
We can represent this state of affairs by (MWGC ≈ ). By this we mean that
the man M, wolf W, goat G and cabbage C are all on the left-hand side of the
river (to the left of the wiggly lines representing the river), while nothing is
on the right-hand side of the river (to the right of the wiggly lines).

From the initial state the man can do one of four things.

1.  He  may  cross  the  river  with  the  goat,  leaving  the  wolf  and  cabbage 
together  on  the  left-hand  side  of  the  river.  We  represent  the  resulting 
state by (WC ≈ MG), denoting that the wolf and cabbage are on the
left-hand  side  while  the  man  and  goat  are  on  the  right-hand  side.  Note 
that  this  labelling  of  the  state  is  purely  for  our  benefit  and  is  not  itself 
a  part  of  the  definition  of  the  process. 

Using  g  to  represent  the  action  of  the  man  crossing  the  river  with  the 
goat,  this  gives  us  the  following  transition: 

$(\mathrm{MWGC} \approx\ ) \xrightarrow{g} (\mathrm{WC} \approx \mathrm{MG})$

2.  He  may  cross  the  river  with  the  wolf,  leaving  the  goat  and  cabbage 
together  on  the  left-hand  side  of  the  river.  We  represent  the  resulting 
state by (GC ≈ MW). We shade this state to indicate that this is an
unacceptable  state  of  affairs,  as  the  goat  will  in  this  instance  eat  the 
cabbage.  Note  however  that  this  shading  -  just  like  the  labelling  of 
the  state  -  is  not  in  itself  a  part  of  the  definition  of  the  process. 

Using  w  to  represent  the  action  of  the  man  crossing  the  river  with  the 
wolf,  this  gives  us  the  following  transition: 

$(\mathrm{MWGC} \approx\ ) \xrightarrow{w} (\mathrm{GC} \approx \mathrm{MW})$

3.  He  may  cross  the  river  with  the  cabbage,  leaving  the  wolf  and  goat 
together  on  the  left-hand  side  of  the  river.  The  resulting  state  will 
then be represented by (WG ≈ MC). Again the shading indicates that
this  is  an  unacceptable  state  of  affairs,  as  the  wolf  will  eat  the  goat. 

Using  c  to  represent  the  action  of  the  man  crossing  the  river  with  the 
cabbage,  this  gives  us  the  following  transition: 

$(\mathrm{MWGC} \approx\ ) \xrightarrow{c} (\mathrm{WG} \approx \mathrm{MC})$

4.  He may cross the river on his own, leaving the wolf, goat and cabbage
together on the left-hand side of the river. The resulting state will
then be represented by (WGC ≈ M). Once again the shading indicates
that  this  is  an  unacceptable  state  of  affairs,  as  the  wolf  will  eat  the 
goat,  which  may  itself  have  had  the  time  and  opportunity  to  first  eat 
the  cabbage. 

Using  m  to  represent  the  action  of  the  man  crossing  the  river  alone, 
this  gives  us  the  following  transition: 

$(\mathrm{MWGC} \approx\ ) \xrightarrow{m} (\mathrm{WGC} \approx \mathrm{M})$

It  is  clear  from  the  above  considerations  that  the  man  initially  has  just  one 
viable  option,  which  is  to  cross  the  river  with  the  goat. 

In  this  fashion,  we  can  model  the  problem  using  a  labelled  transition 
system,  using  states  to  represent  the  possible  states  of  affairs,  and  transitions 
to  represent  the  actions  available  to  the  man.  There  will  be  a  total  of  16 
states  in  this  LTS: 

•  The initial state is represented by the state (MWGC ≈ ) and the desired
final state is represented by the state ( ≈ MWGC).

•  There are 8 further safe states, namely

(MWG ≈ C)  (MWC ≈ G)  (MGC ≈ W)  (MG ≈ WC)

(C ≈ MWG)  (G ≈ MWC)  (W ≈ MGC)  (WC ≈ MG)

•  Finally there are 6 dangerous states, namely

(M ≈ WGC)  (MW ≈ GC)  (MC ≈ WG)

(WGC ≈ M)  (GC ≈ MW)  (WG ≈ MC)

The  transitions  will  be  labelled  according  to  which  of  the  four  actions  the 
man  takes: 

m:  the  man  crosses  the  river  on  his  own. 
w:  the  man  crosses  with  the  wolf. 
g:  the man crosses with the goat.
c:  the  man  crosses  with  the  cabbage. 

The  resulting  LTS  is  presented  in  Figure  11.3.  From  this  LTS  we  can  readily 
read  off  a  solution  to  the  riddle  (which  is  not  unique)  by  following  a  path 
from  the  initial  state  to  the  final  state  which  passes  only  through  safe  states. 

Again  note  that  the  labelling  of  the  states,  including  the  shading  of  what 
we  recognise  to  be  dangerous  states,  is  not  part  of  the  definition  of  an  LTS. 
This  is  included  solely  for  our  own  convenience. 
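Reading a solution off the LTS amounts to searching for a path from the initial state to the final state through safe states only. The following sketch shows one way this could be mechanised, using breadth-first search. The names `is_safe`, `moves` and `solve` are our own, and a state is represented (as an assumption of this sketch) by the set of entities on the left-hand bank.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Tuple

State = FrozenSet[str]          # the set of entities on the left bank
EVERYONE: State = frozenset("MWGC")


def is_safe(left: State) -> bool:
    """Safe if neither bank holds W+G or G+C without the man."""
    for bank in (left, EVERYONE - left):
        if "M" not in bank and ({"W", "G"} <= bank or {"G", "C"} <= bank):
            return False
    return True


def moves(left: State) -> List[Tuple[str, State]]:
    """All (action, next state) pairs; actions are m, w, g, c as in the text."""
    here = left if "M" in left else EVERYONE - left    # bank the man is on
    result = []
    for action, cargo in (("m", ""), ("w", "W"), ("g", "G"), ("c", "C")):
        if cargo and cargo not in here:
            continue                                   # passenger is on the other bank
        crossing = frozenset("M" + cargo)
        result.append((action, left ^ crossing))       # crossing entities switch banks
    return result


def solve() -> Optional[List[str]]:
    """Breadth-first search for a shortest path through safe states only."""
    start, goal = EVERYONE, frozenset()
    parent: Dict[State, Tuple[State, str]] = {}
    queue = deque([start])
    seen = {start}
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state != start:                      # walk back to the start
                state, action = parent[state]
                path.append(action)
            return path[::-1]
        for action, nxt in moves(state):
            if nxt not in seen and is_safe(nxt):
                seen.add(nxt)
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None
```

Calling `solve()` returns a list of action labels. As observed above, the first of these is necessarily `g`, and the solution found is one of several.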




Exercise 11.3  (Solution on page 459)

Three  missionaries  are  travelling  with  three  cannibals  when  they  come  upon 
a  river.  They  have  a  boat,  but  it  can  only  hold  two  people.  The  river  is 
filled  with  piranha,  so  they  all  must  eventually  cross  in  the  boat;  no  one 
can  cross  the  river  by  swimming.  The  problem  is:  should  the  cannibals  ever 
outnumber  the  missionaries  on  either  side  of  the  river,  the  outnumbered 
missionaries  would  be  in  deep  trouble.  Each  missionary  and  each  cannibal 
can  row  the  boat. 

How  can  all  six  get  across  the  river  safely? 
Exercise 11.4  (Solution on page 461)

In the 1995 film Die Hard with a Vengeance, New York Detective John McClane
(played by Bruce Willis) and Harlem dry cleaner Zeus Carver (played
by Samuel L. Jackson) had to solve the following problem in order to prevent
a bomb from exploding at a public fountain. Given only a five-gallon
jug  and  a  three-gallon  jug,  neither  with  any  markings  on  them,  they  had  to 
fill  the  larger  jug  with  exactly  four  gallons  of  water  from  the  fountain,  and 
place  it  onto  a  scale  in  order  to  stop  the  bomb’s  timer  and  prevent  disaster. 
How  did  they  manage  this  feat? 


Exercise 11.5  (Solution on page 461)

You  are  sitting  in  a  pub  wearing  a  blindfold,  and  I  put  in  front  of  you  a 
square  tray  with  a  beer  mat  in  each  of  the  four  corners,  each  of  which  is 
either  face-up  or  face-down,  but  not  all  the  same. 

You  reach  out  and  -  blindly  feeling  your  way  around  the  tray  -  you  turn 
over  as  many  beer  mats  as  you  wish.  When  you  are  through,  if  the  beer 
mats  are  all  oriented  the  same  way  (either  all  face-up  or  all  face-down)  then 
you  win.  Otherwise,  I  will  rotate  the  tray  by  an  arbitrary  amount,  and  let 
you  try  again. 

What  strategy  will  guarantee  that  you  win  the  game? 


11.2  Computations and Processes

Consider the following algorithm, attributed to Euclid (c. 300 BC), for computing
the greatest common divisor (GCD) of two numbers x and y, that is,
the  largest  integer  which  evenly  divides  both  x  and  y.  (In  the  code  below, 
the  modulus  function  x  mod  y  simply  returns  the  remainder  when  dividing 
x  by  y,  and  :=  represents  the  assignment  operation.) 

loop begin
    x := x mod y;
    if x=0 then return y;
    y := y mod x;
    if y=0 then return x
loop end

This algorithm repeatedly “executes” the four lines of code between “loop
begin” and “loop end” until a value is returned. For example, if we apply
this algorithm to the values x=246 and y=174, we get the value 6 returned,
which is indeed the GCD of 246 and 174.

To understand how this program works, we might try hand-turning it,
keeping track of the state (i.e., values) of the variables. For example, starting
in the state in which the variables have the values x=246 and y=174, the
first action which takes place is the assignment x := x mod y; this action has
the effect of changing the state of the system by updating the value of x.

This  computation  is  captured  by  the  labelled  transition  system  depicted 
in  Figure  11.4. 
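The hand-turning described above can itself be mechanised. The sketch below is an illustration only: the function name `gcd_trace` is hypothetical, and its choice of states (pairs of values of x and y) and action labels (the two assignments) is an assumption that need not coincide with the labelling used in Figure 11.4. It runs the loop and records each assignment as a labelled transition.

```python
from typing import List, Tuple

State = Tuple[int, int]   # the values of (x, y)


def gcd_trace(x: int, y: int) -> Tuple[int, List[Tuple[State, str, State]]]:
    """Run the loop above, recording each assignment as a labelled transition.

    Assumes x and y are positive integers.
    """
    transitions: List[Tuple[State, str, State]] = []
    while True:
        before = (x, y)
        x = x % y                                  # x := x mod y
        transitions.append((before, "x := x mod y", (x, y)))
        if x == 0:
            return y, transitions
        before = (x, y)
        y = y % x                                  # y := y mod x
        transitions.append((before, "y := y mod x", (x, y)))
        if y == 0:
            return x, transitions
```

For instance, `gcd_trace(246, 174)` returns the value 6 together with the list of recorded transitions, the first of which updates x from 246 to 72.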


Exercise 11.6  (Solution on page 462)

Consider  the  transition  system  depicted  in  Figure  11.4. 

1.  How  many  states  are  there?  List  them. 

2.  How  many  distinct  actions  are  there?  List  them. 

3.  How  many  transitions  are  there?  List  them. 


As  an  example  of  a  more  abstract  process,  consider  the  workings  of  the 
simple  table  lamp  represented  in  Figure  11.5  which  has  a  string  to  pull  for 
turning  the  light  on  and  off,  and  a  reset  button  which  resets  the  circuit  if  a 
built-in  circuit  breaker  breaks  when  the  light  is  on.  At  any  moment  in  time 
the  lamp  can  be  in  one  of  three  states: 

•  Off  -  in  which  the  light  is  off  (and  the  circuit  breaker  is  set); 

•  On  -  in  which  the  light  is  on  (and  the  circuit  breaker  is  set);  and 

•  Broken  -  in  which  the  circuit  breaker  is  broken  (and  the  light  is  off). 


In any state the string can be pulled, causing a transition into the appropriate
new state (from the state Broken, the new state is the same state
Broken). In the state On, the circuit breaker may break, causing a transition
into the state Broken in which the reset button has popped out; from
this state, the reset button may be pushed, causing a transition into the
state Off.
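As a small illustration, the lamp just described can be written down as a table of transitions. This is only a sketch: the names `LAMP` and `run_lamp` are hypothetical, and a dictionary from (state, action) pairs to states is used only because this particular process has at most one transition per state and action.

```python
from typing import Dict, List, Tuple

# (state, action) -> next state, following the description of the lamp above.
LAMP: Dict[Tuple[str, str], str] = {
    ("Off", "pull"): "On",
    ("On", "pull"): "Off",
    ("On", "break"): "Broken",
    ("Broken", "pull"): "Broken",
    ("Broken", "reset"): "Off",
}


def run_lamp(start: str, actions: List[str]) -> str:
    """Follow a sequence of actions; raise if an action is not possible."""
    state = start
    for action in actions:
        if (state, action) not in LAMP:
            raise ValueError(f"{action!r} is not possible in state {state}")
        state = LAMP[(state, action)]
    return state
```

For example, `run_lamp("Off", ["pull", "break", "pull", "reset"])` ends in the state `"Off"`, whereas attempting `reset` from `"Off"` raises an error because no such transition exists.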


Exercise 11.7  (Solution on page 463)

Extend the lamp process by adding actions “blow” and “replace”, which
model  the  blowing  and  replacing  of  the  light  bulb.  Assume  that  the  bulb  can 
only  blow  when  the  light  is  on,  and  that  only  a  blown  bulb  can  be  replaced. 
Keep  in  mind  that  the  string  can  still  be  pulled  even  if  the  bulb  is  blown, 
and  that  when  a  bulb  is  replaced,  the  lamp  may  be  on  or  off  depending  on 
the  pulls  of  the  string. 


Example  11.7 


We  can  model  a  very  simple  clock  that  does  nothing  but  tick  repeatedly 
forever  as  follows: 


[Diagram: a single state Cl with an arrow labelled tick looping back to itself.]


This simple transition system has only one state Cl and one transition
$\mathrm{Cl} \xrightarrow{\mathit{tick}} \mathrm{Cl}$, but it can be “unrolled” into the following infinite-state
transition system:

$\mathrm{Cl} \xrightarrow{\mathit{tick}} \mathrm{Cl} \xrightarrow{\mathit{tick}} \mathrm{Cl} \xrightarrow{\mathit{tick}} \mathrm{Cl} \xrightarrow{\mathit{tick}} \mathrm{Cl} \xrightarrow{\mathit{tick}} \cdots$

Here, the state Cl is reproduced infinitely many times; however, these two
transition systems display identical behaviours.

A  more  realistic  clock  will  typically  tick  only  a  finite  number  of  times  and 
then  stop.  For  example,  a  clock  which  ticks  exactly  3  times  before  falling 
silent  would  be  modelled  as  follows: 

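$\mathrm{Cl}_3 \xrightarrow{\mathit{tick}} \mathrm{Cl}_2 \xrightarrow{\mathit{tick}} \mathrm{Cl}_1 \xrightarrow{\mathit{tick}} \mathrm{Cl}_0$


Here the state $\mathrm{Cl}_3$ represents the state of interest. However, we can see that
the additional states $\mathrm{Cl}_2$, $\mathrm{Cl}_1$ and $\mathrm{Cl}_0$ represent clocks which tick exactly 2,
1, and 0 times, respectively. We can immediately generalise this example to
model an infinite number of states $\mathrm{Cl}_n$ (for $n \in \mathbb{N}$) where state $\mathrm{Cl}_n$ represents
a clock which ticks exactly $n$ times before falling silent.


$\cdots \xrightarrow{\mathit{tick}} \mathrm{Cl}_4 \xrightarrow{\mathit{tick}} \mathrm{Cl}_3 \xrightarrow{\mathit{tick}} \mathrm{Cl}_2 \xrightarrow{\mathit{tick}} \mathrm{Cl}_1 \xrightarrow{\mathit{tick}} \mathrm{Cl}_0$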


This  is  very  similar  to  the  unrolled  version  of  the  infinitely-ticking  clock 
above;  they  both  have  an  infinite  number  of  states,  with  each  state  making 
a  single  tick  action  to  get  to  the  next  state.  However,  in  this  system  with 
the different states $\mathrm{Cl}_n$, each tick action takes us one step closer to the state
$\mathrm{Cl}_0$ in which the clock stops. In the unrolled system above each state is the
same as any other state in that an infinite number of tick actions can be
performed from it. In particular, there is no state like $\mathrm{Cl}_0$ from which no
tick  action  can  occur. 


Exercise 11.8  (Solution on page 463)

Draw a model of a clock $\mathrm{Cl}^*$ which can tick any number of times, but may
stop  ticking  after  any  tick.  How  many  states,  actions,  and  transitions  does 
your  model  have? 


Example 11.8

In  this  example  we  consider  a  simple  elevator  which  moves  between  three 
floors.  The  state  of  the  elevator  reflects  three  entities: 

•  Which  floor  it  is  at  or,  if  it  is  between  floors,  which  pair  of  floors  it 
is  travelling  between;  along  with  the  direction  it  is  travelling.  This 
information  will  be  one  of  the  following: 
    1:  at 1 (heading up);
    $2^{\uparrow}$:  at 2 (heading up);
    $2^{\downarrow}$:  at 2 (heading down);
    3:  at 3 (heading down);

    $1^{+}$:  moving from 1 to 2;
    $2^{+}$:  moving from 2 to 3;
    $2^{-}$:  moving from 2 to 1;
    $3^{-}$:  moving from 3 to 2.


•  Whether  the  door  is  open  or  closed. 

•  Which  set  of  floors  it  has  to  travel  to  -  and  in  which  direction  -  due 
either  to  a  call  button  being  pressed  on  a  floor,  or  a  floor  button 
being  pressed  in  the  elevator.  This  information  will  be  a  subset  of  the 
following: 


    1:  collect from or drop off at floor 1;
    $2^{\uparrow}$:  collect from or drop off at floor 2 while heading up;
    $2^{\downarrow}$:  collect from or drop off at floor 2 while heading down;
    3:  collect from or drop off at floor 3.


A state, therefore, will be of the form (floor, door, stops) where

    $\mathit{floor} \in \{1, 2^{\uparrow}, 2^{\downarrow}, 3\} \cup \{1^{+}, 2^{+}, 2^{-}, 3^{-}\}$;
    $\mathit{door} \in \{\mathit{open}, \mathit{closed}\}$; and
    $\mathit{stops} \subseteq \{1, 2^{\uparrow}, 2^{\downarrow}, 3\}$.

There will therefore be as many as $8 \times 2 \times 2^4 = 256$ states.
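The count of 256 can be checked by enumerating the candidate states explicitly. The following sketch does this. The string labels (such as `"2up"` and `"2down"` for floor 2 while heading up or down) are our own hypothetical encoding of the symbols above, and the enumeration makes no claim that every one of these states is actually reachable.

```python
from itertools import combinations, product

# Hypothetical string labels for the eight floor/direction values.
FLOORS = ["1", "2up", "2down", "3", "1+", "2+", "2-", "3-"]
DOORS = ["open", "closed"]
STOPS = ["1", "2up", "2down", "3"]


def subsets(items):
    """All subsets of `items`, each as a frozenset."""
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Every triple (floor, door, stops) with stops ranging over all subsets of STOPS.
STATES = list(product(FLOORS, DOORS, subsets(STOPS)))
```

Here `len(STATES)` evaluates to 256, in agreement with the calculation above: 8 floor values, 2 door values, and 16 subsets of the four possible stops.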


Exercise 11.9  (Solution on page 463)

Augment the above description of the states of the elevator system by describing:
the set of possible actions; when (in which states) each possible
action  can  occur;  and  how  the  state  changes  when  that  action  occurs.  (Of 
course,  with  256  states,  it  is  unreasonable  to  draw  out  this  labelled  transition 
system,  so  don’t  even  try!) 


Exercise 11.10  (Solution on page 466)

Consider  the  process  of  flipping  a  coin,  in  which  the  following  three  actions 
are  possible  at  different  times: 

•  a  toss  action  in  which  the  coin  is  tossed  into  the  air; 

•  a  heads  action  in  which  the  coin  lands  with  heads  showing;  and 

•  a  tails  action  in  which  the  coin  lands  with  tails  showing. 
Upon doing a toss action, either a heads action or a tails action will occur,
and  the  process  will  be  back  in  its  initial  state  in  which  the  coin  may  once 
again  be  flipped. 

Draw  two  different  models  of  this  process:  the  first  in  which  the  outcome 
of  the  flip  is  determined  already  when  the  coin  is  tossed,  and  the  second  in 
which  the  outcome  of  the  flip  is  determined  only  when  it  is  observed.  What 
are  the  implications  of  these  different  interpretations  of  determinism?  Which 
do  you  consider  to  be  the  most  realistic  model? 


11.3  A Language for Describing Processes

Drawing processes graphically is fine for small examples. However, you
would never draw the labelled transition system for even moderately complex
processes. For example, drawing the labelled transition system for the elevator
in Example 11.8 above would not only be tedious and error-prone, it
would not be very insightful.

We will need a proper language for describing bigger and more complicated
systems. A formal description language can also be programmed and
analysed  (verified)  on  a  machine.  In  this  section  we  shall  present  such  a 
language.  The  language,  which  we  refer  to  as  Proc,  will  have 

•  (process)  variables,  such  as  Off,  On,  and  Broken;  and 

•  events  or  actions,  such  as  pull,  break,  and  reset. 

Every  expression  in  the  language  will  represent  a  state  in  a  labelled  transition 
system.  Each  process  variable  is  itself  an  expression  in  the  language,  and  all 
of  the  expressions  in  the  language  will  be  built  up  from  actions  and  process 
variables  using  simple  operations  for  combining  them.  In  the  remainder  of 
this  section  we  shall  explore  the  two  basic  operations  in  the  language:  action 
prefix  and  choice,  as  well  as  the  means  by  which  processes  are  defined. 

11.3.1  The  Nil  Process  0 

The  most  basic  process  expression  in  the  language  is  0,  which  is  referred  to 
as  the  nil  process  and  represents  a  state  which  has  no  transitions  leading 
out  of  it: 


For example, the state $\mathrm{Cl}_0$ from Example 11.7 is an example of the nil process.
A process which evolves into such a state 0 is said to have deadlocked.

11.3.2  Action Prefix

If  a  is  an  action  and  E  is  a  process  expression,  then  the  action  prefix 
expression  a.E  represents  a  state  in  a  process  which  has  one  transition:  an 
a-transition  leading  to  the  state  represented  by  E: 


$a.E \xrightarrow{a} E$


As an example, the clock $\mathrm{Cl}_1$ which ticks once and then falls silent is represented
by the expression

tick.0

which depicts the following process:

$\mathit{tick}.0 \xrightarrow{\mathit{tick}} 0$


By repeatedly applying the action prefix operation, we can express the clock
$\mathrm{Cl}_3$ which ticks three times before falling silent as follows:

tick.tick.tick.0

which depicts the following process:

$\mathit{tick}.\mathit{tick}.\mathit{tick}.0 \xrightarrow{\mathit{tick}} \mathit{tick}.\mathit{tick}.0 \xrightarrow{\mathit{tick}} \mathit{tick}.0 \xrightarrow{\mathit{tick}} 0$

For an example based on the lamp process of Figure 11.5, if pull is an
action and On is a process variable (and hence a valid expression), the action
prefix expression

pull.On

represents the following state with a single pull transition:

$\mathit{pull}.\mathrm{On} \xrightarrow{\mathit{pull}} \mathrm{On}$

11.3.3  Process Definitions

Each  process  variable,  being  a  process  expression,  represents  a  state  in  a 
labelled  transition  system,  and  as  such  stands  for  some  process.  Every 
process  variable  X  must  therefore  have  a  defining  equation 

$X \stackrel{\text{def}}{=} E$

where  E  is  the  process  expression  for  which  X  stands.  The  transitions 
leading  from  the  state  represented  by  X  are  determined  by  (that  is,  are  the 
same  as)  those  leading  from  the  state  represented  by  E. 

For  example,  in  the  simple  table  lamp  process  of  Figure  11.5,  the  state 
represented  by  the  process  variable  Off  has  a  single  transition  leading  out 
of  it  labelled  pull  and  leading  into  the  state  represented  by  the  variable  On. 
The  process  variable  Off  can  thus  be  defined  as  follows: 

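$\mathrm{Off} \stackrel{\text{def}}{=} \mathit{pull}.\mathrm{On}$

The one and only transition from the state represented by the expression
pull.On is

$\mathit{pull}.\mathrm{On} \xrightarrow{\mathit{pull}} \mathrm{On}$

and since Off - by definition - has the same transitions leading out of it
as pull.On, the one and only transition from the state represented by the
expression Off is

$\mathrm{Off} \xrightarrow{\mathit{pull}} \mathrm{On}$

The process definition $X \stackrel{\text{def}}{=} E$ thus defines which transitions are
possible from the state represented by the variable X, namely precisely
those possible from the expression E. In other words, X is defined to be
identical in behaviour to E. Formally,

if $X \stackrel{\text{def}}{=} E$ and $E \xrightarrow{a} E'$ then $X \xrightarrow{a} E'$.

In the lamp process, since

$\mathrm{Off} \stackrel{\text{def}}{=} \mathit{pull}.\mathrm{On}$ and $\mathit{pull}.\mathrm{On} \xrightarrow{\mathit{pull}} \mathrm{On}$,

we have the transition

$\mathrm{Off} \xrightarrow{\mathit{pull}} \mathrm{On}$.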
11.3.4  Choice

The  state  Off  in  the  lamp  process  is  particularly  simple  as  there  is  only  one 
transition  leading  out  of  it.  However,  the  other  two  states  offer  a  choice  of 
transitions: 


•  from  the  state  On  there  is  a  transition  pull  leading  to  the  state  Off 
and  a  transition  break  leading  to  the  state  Broken;  and 

•  from  the  state  Broken  there  is  a  transition  pull  leading  back  to  the 
state  Broken  and  a  transition  reset  leading  to  the  state  Off. 

Thus, the state On may behave either like the state pull.Off or like the
state break.Broken, and the state Broken may behave either like the state
pull.Broken or like the state reset.Off.

Such  choices  between  behaviours  are  catered  for  in  the  language  Proc 
by  the  choice  operation:  given  expressions  E  and  F,  the  expression  E  +  F 
represents  the  process  state  which  has  all  of  the  transitions  of  E  and  of  F; 
in  essence,  it  can  behave  as  either  E  or  as  F,  with  the  choice  being  taken  at 
the  moment  the  first  transition  occurs.  More  formally, 

if $E \xrightarrow{a} E'$ then $E + F \xrightarrow{a} E'$, and
if $F \xrightarrow{a} F'$ then $E + F \xrightarrow{a} F'$.


Referring  to  the  lamp  example,  from  the  state  On  we  can  either  perform  a 
pull  action  to  go  to  state  Off  or  a  break  action  to  go  to  state  Broken;  and 
from  the  state  Broken  we  can  either  perform  a  pull  action  to  go  to  state 
Broken  or  a  reset  action  to  go  to  state  Off.  The  two  process  variables  On 
and  Broken  thus  have  the  following  definitions: 


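$\mathrm{On} \stackrel{\text{def}}{=} \mathit{pull}.\mathrm{Off} + \mathit{break}.\mathrm{Broken}$

$\mathrm{Broken} \stackrel{\text{def}}{=} \mathit{pull}.\mathrm{Broken} + \mathit{reset}.\mathrm{Off}$

By the above rules for choice, we thus have the transitions

$\mathit{pull}.\mathrm{Off} + \mathit{break}.\mathrm{Broken} \xrightarrow{\mathit{pull}} \mathrm{Off}$ and

$\mathit{pull}.\mathrm{Off} + \mathit{break}.\mathrm{Broken} \xrightarrow{\mathit{break}} \mathrm{Broken}$

and hence (by the process definition operation) the transitions

$\mathrm{On} \xrightarrow{\mathit{pull}} \mathrm{Off}$ and $\mathrm{On} \xrightarrow{\mathit{break}} \mathrm{Broken}$.


From state On we can choose to perform action pull to go into state Off or
to perform action break to go into state Broken.

By analogous reasoning we can infer the following two transitions for the
state represented by the variable Broken:

$\mathrm{Broken} \xrightarrow{\mathit{pull}} \mathrm{Broken}$ and $\mathrm{Broken} \xrightarrow{\mathit{reset}} \mathrm{Off}$.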
Exercise 11.11  (Solution on page 467)


Explain  how  these  last  two  transitions  for  Broken  can  be  inferred. 


We  can  extend  the  choice  operation  in  the  natural  way  to  any  finite  sum 


$E_1 + E_2 + \cdots + E_n$

to describe a choice of doing the actions of any of the summand process
terms $E_i$ (where $1 \le i \le n$). We shall also sometimes write the choice
operation as an indexed sum; that is, instead of writing $E_1 + E_2 + \cdots + E_n$
we may write, for example, any of the following:

$\sum_{i=1}^{n} E_i$  or  $\sum_{1 \le i \le n} E_i$  or  $\sum \{E_1, E_2, \ldots, E_n\}$  or  $\sum \{E_i : 1 \le i \le n\}$.

More generally, given any (possibly infinite) set $S$ of process expressions, we
can write $\sum S$ to represent a choice over all of these. For example, instead
of writing

$\mathrm{On} \stackrel{\text{def}}{=} \mathit{pull}.\mathrm{Off} + \mathit{break}.\mathrm{Broken}$

we could write

$\mathrm{On} \stackrel{\text{def}}{=} \sum \{ \mathit{pull}.\mathrm{Off},\ \mathit{break}.\mathrm{Broken} \}$.

The transitions that the process expression $\sum S$ can perform are determined
by the process expressions in $S$. Formally,

if $E \in S$ and $E \xrightarrow{a} E'$ then $\sum S \xrightarrow{a} E'$.

Furthermore, as $S$ may be an infinite set of process expressions, we may use
this notation to express infinite choices. For example, the infinite choice

$E_1 + E_2 + E_3 + E_4 + \cdots$

can be written as $\sum_{i \ge 1} E_i$.

An interesting case of this generalised choice operation occurs when $S$
is the empty set $\emptyset$. The process expression $\sum \emptyset$, by definition, can make no
transitions, and thus provides a definition of the nil process: $0 \stackrel{\text{def}}{=} \sum \emptyset$.

The syntax (notation) and semantics (meaning) of the language Proc
are summarised in Figure 11.6.
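
The transition rules can be read as a small recursive procedure. The following Python sketch is one possible encoding, not part of the language Proc itself: the class names `Var`, `Prefix` and `Sum`, the function `transitions` and the dictionary `DEFS` are all hypothetical names chosen for illustration. An empty `Sum` plays the role of the nil process, and the sketch assumes that every variable occurrence in a definition is guarded by an action prefix (otherwise the recursion would not terminate).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    """A process variable X, looked up in a table of definitions."""
    name: str


@dataclass(frozen=True)
class Prefix:
    """An action prefix a.E."""
    action: str
    cont: object


@dataclass(frozen=True)
class Sum:
    """A choice over finitely many summands; Sum(()) acts as the nil process 0."""
    summands: Tuple[object, ...] = ()


def transitions(term, defs):
    """Return the set of (action, successor) pairs that the rules allow for term."""
    if isinstance(term, Prefix):
        # Action prefix rule: a.E can do a and become E.
        return {(term.action, term.cont)}
    if isinstance(term, Sum):
        # Choice rule: every transition of a summand is a transition of the sum.
        result = set()
        for summand in term.summands:
            result |= transitions(summand, defs)
        return result
    if isinstance(term, Var):
        # Process variable rule: X moves exactly as its defining expression does.
        return transitions(defs[term.name], defs)
    raise TypeError(f"not a process term: {term!r}")


# The two definitions given in the text (Off is only mentioned, never unfolded here).
DEFS = {
    "On": Sum((Prefix("pull", Var("Off")), Prefix("break", Var("Broken")))),
    "Broken": Sum((Prefix("pull", Var("Broken")), Prefix("reset", Var("Off")))),
}

for action, successor in sorted(transitions(Var("On"), DEFS), key=lambda t: t[0]):
    print(f"On --{action}--> {successor.name}")
```

Representing terms as immutable values means that the successor of a transition is itself a term, which matches the view that states of a process are expressions of the language.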

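Example 11.11

Continuing with Example 11.7, we can give the following process definitions
to the simple clock $\mathit{Cl}$ that ticks forever, and the clocks $\mathit{Cl}_n$ (for $n \in \mathbb{N}$)
which tick exactly $n$ times:

$\mathit{Cl} \stackrel{\text{def}}{=} \mathit{tick}.\mathit{Cl}$     $\mathit{Cl}_0 \stackrel{\text{def}}{=} 0$     $\mathit{Cl}_{n+1} \stackrel{\text{def}}{=} \mathit{tick}.\mathit{Cl}_n$

Name              Syntax                     Semantics

Process Variable  $X$                        If $E \xrightarrow{a} E'$ and $X \stackrel{\text{def}}{=} E$
                                             then $X \xrightarrow{a} E'$

Action Prefix     $a.E$                      $a.E \xrightarrow{a} E$

Choice (1)        $E + F$                    If $E \xrightarrow{a} E'$ then $E + F \xrightarrow{a} E'$
                                             If $F \xrightarrow{a} F'$ then $E + F \xrightarrow{a} F'$

Choice (2)        $\sum \{E_i : i \in I\}$   If $E_j \xrightarrow{a} E'$ with $j \in I$
                                             then $\sum \{E_i : i \in I\} \xrightarrow{a} E'$

Nil               $0$                        no transitions ($0 \stackrel{\text{def}}{=} \sum \emptyset$)

Figure 11.6: Syntax and semantics of the process language Proc.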


Consider  now  a  clock,  which  we  will  call  Clock,  which  may  tick  some 
finite  but  indeterminate  number  of  times  depending  on  the  amount  of  energy 
powering  it  and  then  stop.  There  is  no  way  of  knowing  a  priori  how  many 
times  it  will  tick;  it  will  tick  once  when  it  is  started,  and  then  continue  to 
tick  until  its  energy  source  is  depleted.  Thus,  after  the  first  tick,  the  clock 
will  be  in  a  state  in  which  it  will  tick  again  some  precise  finite  number  of 
times. That is, it will be in the state $\mathit{Cl}_n$ for some $n \in \mathbb{N}$. Its definition as
a process is given as follows:

$\mathit{Clock} \stackrel{\text{def}}{=} \sum_{i \ge 0} \mathit{tick}.\mathit{Cl}_i$.

Finally,  consider  yet  another  clock,  which  we  will  call  Clock*,  which  may 
behave  like  Clock,  but  might  decide  -  upon  performing  the  first  tick  -  to 
continue  ticking  forever.  That  is,  it  has  the  possibility  to  evolve  into  the 
state  Cl  after  the  first  tick.  Its  definition  as  a  process  is  given  as  follows: 

$\mathit{Clock}^* \stackrel{\text{def}}{=} \sum_{i \ge 0} \mathit{tick}.\mathit{Cl}_i + \mathit{tick}.\mathit{Cl}$.

These  processes  all  appear  in  the  transition  system  depicted  in  Figure  11.7. 
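
An infinite choice cannot be listed in full, but its transitions can be enumerated lazily. The following Python sketch illustrates this for the clocks above, under the assumptions that the natural numbers start at 0 and that a state $\mathit{Cl}_n$ is encoded as the pair `("Cl", n)`; the function names `cl`, `clock` and `ticks_remaining` are hypothetical.

```python
from itertools import count, islice


def cl(n):
    """Transitions of Cl_n: a single tick to Cl_(n-1) if n > 0, none for Cl_0."""
    return [("tick", ("Cl", n - 1))] if n > 0 else []


def clock():
    """Lazily enumerate the transitions of Clock, one tick.Cl_i for each i >= 0."""
    for i in count():
        yield ("tick", ("Cl", i))


def ticks_remaining(n):
    """Follow Cl_n until no transition is left and count the ticks performed."""
    ticks = 0
    moves = cl(n)
    while moves:
        _action, (_name, n) = moves[0]
        ticks += 1
        moves = cl(n)
    return ticks


# Only a finite prefix of the infinitely many first moves of Clock can be shown.
print(list(islice(clock(), 3)))
print(ticks_remaining(3))
```

After any single first tick of `clock()` the remaining behaviour is one of the finite clocks, so every individual run stops even though the set of possible first moves is infinite.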


Exercise 11.12 (Solution on page 467)


Give  a  process  definition  for  the  Clock  process  Cl*  which  you  defined  in 
Exercise  11.8. 


Exercise 11.13 (Solution on page 467)

Design a simple change-making process which will initially accept a 5p, 10p
or 20p coin, and dispense any sequence of 1p, 2p and 5p coins which sum
up to the value of the coin inserted, before returning to its initial state.

To do this, introduce the process variables $C_n$ for $n \in \{0, 1, 2, \ldots, 20\}$,
and the following actions:

$i_5$: insert a 5p coin        $d_1$: dispense a 1p coin

$i_{10}$: insert a 10p coin    $d_2$: dispense a 2p coin

$i_{20}$: insert a 20p coin    $d_5$: dispense a 5p coin

Each variable $C_n$ is to represent the process in the state in which $n$ pence
remains to be dispensed. In particular, the process variable $C_0$ is to represent
the initial state of the process, and has the following definition:

$C_0 \stackrel{\text{def}}{=} i_5.C_5 + i_{10}.C_{10} + i_{20}.C_{20}$

1. Give the definitions for the remaining process variables $C_1, C_2, \ldots, C_{20}$.

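2. Draw the labelled transition system representing this process.

$V_1 \stackrel{\text{def}}{=} \mathit{10p}.\mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1)$

$V_2 \stackrel{\text{def}}{=} \mathit{10p}.(\mathit{10p}.\mathit{coffee}.\mathit{collect}.V_2 + \mathit{10p}.\mathit{tea}.\mathit{collect}.V_2)$

$V_3 \stackrel{\text{def}}{=} \mathit{10p}.\mathit{10p}.\mathit{coffee}.\mathit{collect}.V_3 + \mathit{10p}.\mathit{10p}.\mathit{tea}.\mathit{collect}.V_3$

Figure 11.8: Three implementations of a vending machine.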


11.4 Distinguishing Between Behaviours

Consider the problem of designing a simple vending machine which will allow
a user to insert two 10p coins in succession, and then push a coffee or a
tea  button;  the  user  will  then  be  allowed  to  collect  the  relevant  beverage, 
after  which  the  machine  will  return  to  its  original  state,  permitting  the  next 
person  to  use  it. 

This  informal  description  is  typical  of  how  an  actual  “specification”  may 
appear,  but  demonstrates  the  sort  of  ambiguity  which  arises  in  such  loose 
specifications.  The  problem  in  this  case  stems  from  the  ambiguity  of  the 
word  “or.”  The  vending  machine  might  be  implemented  by  any  one  of  the 
three  programs  in  Figure  11.8.  We  can  draw  these  three  processes  as  in 
Figure  11.9.  Note  that  the  states  of  these  transition  systems,  though  not 
spelt  out,  all  represent  expressions  of  the  language  Proc.  For  example,  the 
four  states  of  the  first  process,  from  left  to  right,  represent  the  following 
expressions: 

$V_1$

$\mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1)$

$\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1$

$\mathit{collect}.V_1$

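The transitions are easily derived using the semantic rules for inferring
transitions. For example,

$V_1 \xrightarrow{\mathit{10p}} \mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1)$

since

$V_1 \stackrel{\text{def}}{=} \mathit{10p}.\mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1)$

and by the action prefix rule,

$\mathit{10p}.\mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1) \xrightarrow{\mathit{10p}} \mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1)$.


Exercise 11.14 (Solution on page 467)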


List  the  states  of  the  other  two  vending  machine  processes. 


Clearly  the  behaviour  of  V1  is  different  from  the  behaviour  of  V2  and  V3. 
Specifically,  the  following  property  is  true  of  V1  but  not  true  of  V2  nor  of  V3. 

After inserting two 10p coins,
we are guaranteed to be able to press the coffee button.

In other words,

No matter how we do a 10p action,
we must end up in a state in which,
no matter how we do a 10p action,
we must end up in a state in which
we may do a coffee action.

In contrast, the following property is the negation of the above property,
and as such is true of V2 and V3 but not V1.

We may do a 10p action and end up in a state in which
we may do a 10p action and end up in a state in which
we cannot do a coffee action.

Notice that the negation of a must property (i.e., a necessity) is a may
property (a possibility), and vice versa.

It  is  less  clear,  but  still  true,  that  the  behaviour  of  V2  is  different  from 
the  behaviour  of  V3.  In  particular,  the  following  property  is  true  of  V3  but 
not  true  of  V1  nor  of  V2. 

Already after inserting the first 10p coin,
we have lost either the possibility of selecting a coffee,
or the possibility of selecting a tea.

Even simpler,

We may do a 10p action and end up in a state in which,
no matter how we do a 10p action,
we must end up in a state in which
we cannot do a tea action.
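
Properties of this kind translate directly into nested quantifiers over transitions: each "must" becomes a universal check over successors and each "may" an existential one. The following Python sketch checks two of the properties above on hand-written transition tables for the three machines. The state labels (such as `paid10` or `coffee_only`) and the function names are hypothetical and serve only to make the tables readable.

```python
# Each machine maps a state to its outgoing (action, next_state) pairs.
V1 = {
    "V1": [("10p", "paid10")],
    "paid10": [("10p", "choose")],
    "choose": [("coffee", "serve"), ("tea", "serve")],
    "serve": [("collect", "V1")],
}
V2 = {
    "V2": [("10p", "paid10")],
    "paid10": [("10p", "coffee_only"), ("10p", "tea_only")],
    "coffee_only": [("coffee", "serve")],
    "tea_only": [("tea", "serve")],
    "serve": [("collect", "V2")],
}
V3 = {
    "V3": [("10p", "coffee_path"), ("10p", "tea_path")],
    "coffee_path": [("10p", "coffee_only")],
    "tea_path": [("10p", "tea_only")],
    "coffee_only": [("coffee", "serve")],
    "tea_only": [("tea", "serve")],
    "serve": [("collect", "V3")],
}


def succ(lts, state, action):
    """All states reachable from state by one transition labelled action."""
    return [target for (label, target) in lts[state] if label == action]


def can(lts, state, action):
    """True if state has at least one transition labelled action."""
    return bool(succ(lts, state, action))


def coffee_guaranteed(lts, start):
    """Must 10p, must 10p, may coffee: universal over both coin insertions."""
    return all(
        can(lts, s2, "coffee")
        for s1 in succ(lts, start, "10p")
        for s2 in succ(lts, s1, "10p")
    )


def tea_lost_after_first_coin(lts, start):
    """May 10p, then must 10p, then cannot tea."""
    return any(
        all(not can(lts, s2, "tea") for s2 in succ(lts, s1, "10p"))
        for s1 in succ(lts, start, "10p")
    )


for name, lts in (("V1", V1), ("V2", V2), ("V3", V3)):
    print(name, coffee_guaranteed(lts, name), tea_lost_after_first_coin(lts, name))
```

Note that a universal check over an empty set of successors is vacuously true, which agrees with the reading of "no matter how we do an action" when the action is not available at all.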


Exercise 11.15 (Solution on page 468)


Negate this property to get a property which is true of V1 and V2 but not
true of V3.


The question then is: How do we formally distinguish between processes?
Clearly, the answer to this question lies at the heart of the problem
of verifying the correctness of systems. Answering this question is the goal
of the next chapter.


Exercise 11.16

$A \stackrel{\text{def}}{=} b.c.0 + b.d.0$     $C \stackrel{\text{def}}{=} a.B + a.A$

$B \stackrel{\text{def}}{=} A + b.(c.0 + d.0)$     $D \stackrel{\text{def}}{=} a.B$


1. Draw a transition system which includes the above states A, B, C
and D.


2. Explain clearly how the states C and D behave differently.


11.5 Equality Between Processes

At  the  start  of  Section  11.1  we  posited  that  a  state  of  a  process  is  completely 
determined  by  what  actions  may  occur  in  that  state,  and  what  new  states 
each  action  might  lead  to.  With  this  in  mind,  we  would  naturally  consider 
two states $E$ and $F$ to be equal, $E = F$, whenever it is the case that for all
actions $a$ and states $G$: $E \xrightarrow{a} G$ if, and only if, $F \xrightarrow{a} G$. That is, $E = F$
whenever the following is true:

• if $E \xrightarrow{a} G$ then $F \xrightarrow{a} G$; and

• if $F \xrightarrow{a} G$ then $E \xrightarrow{a} G$.


Exercise 11.17 (Solution on page 469)

Show  that  this  notion  of  equality  is  an  equivalence  relation  over  process 
expressions;  namely  that  it  is  reflexive,  symmetric  and  transitive. 


By  equating  states  of  a  process  in  this  way,  we  can  show  that  the  following 
equations  are  true  of  process  terms  defined  in  the  language  Proc. 

(S1) $E + 0 = E$.

(S2) $E + E = E$.

(S3) $E + F = F + E$.

(S4) $(E + F) + G = E + (F + G)$.

(S5) If $X \stackrel{\text{def}}{=} E$ then $X = E$.

Each  of  these  equations  is  easily  justified  by  considering  the  rules  by  which 
transitions  can  be  inferred. 

Example 11.17

To show that $E + 0 = E$, we need to confirm the validity of the following
two propositions:

• if $E + 0 \xrightarrow{a} G$ then $E \xrightarrow{a} G$; and

• if $E \xrightarrow{a} G$ then $E + 0 \xrightarrow{a} G$.

The second proposition follows immediately from the rule for choice.

For the first proposition, if $E + 0 \xrightarrow{a} G$ then the rule for choice says that
either $E \xrightarrow{a} G$ or $0 \xrightarrow{a} G$; but since there are no transitions leading out of
$0$, we must have that $E \xrightarrow{a} G$ as required.
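
The same style of argument can be tried out mechanically on concrete terms. The following Python sketch encodes terms as nested tuples and compares their sets of transitions, which is exactly the notion of equality used in this section. It only tests particular instances of the laws (S1) to (S4) and is no substitute for the general argument; the helper names `pre`, `plus`, `transitions` and `same_transitions`, and the sample terms `E`, `F`, `G`, are hypothetical.

```python
NIL = ("nil",)


def pre(action, term):
    """Build the action prefix action.term."""
    return ("pre", action, term)


def plus(left, right):
    """Build the binary choice left + right."""
    return ("sum", left, right)


def transitions(term):
    """Set of (action, successor) pairs of a variable-free term."""
    kind = term[0]
    if kind == "nil":
        return frozenset()                      # 0 has no transitions
    if kind == "pre":
        return frozenset({(term[1], term[2])})  # a.E --a--> E
    if kind == "sum":
        return transitions(term[1]) | transitions(term[2])
    raise ValueError(f"unknown term: {term!r}")


def same_transitions(e, f):
    """Equality in the sense of this section: identical outgoing transitions."""
    return transitions(e) == transitions(f)


# Three sample terms chosen arbitrarily for the check.
E = plus(pre("a", NIL), pre("b", pre("c", NIL)))
F = pre("a", pre("b", NIL))
G = pre("c", NIL)

laws = {
    "S1": same_transitions(plus(E, NIL), E),
    "S2": same_transitions(plus(E, E), E),
    "S3": same_transitions(plus(E, F), plus(F, E)),
    "S4": same_transitions(plus(plus(E, F), G), plus(E, plus(F, G))),
}
print(laws)
```

Because the transitions of a sum are computed as a set union, the four laws reduce to familiar properties of union: the empty set is neutral, and union is idempotent, commutative and associative.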


Exactly when two states should be deemed equal is explored in detail in
the next chapter; however, the above equations will certainly be true. Even
further, we can recursively extend the notion of equality between states
by declaring two states $E$ and $F$ to be equal, $E = F$, not only if they
possess exactly the same transitions, but whenever each transition of one
can be matched, up to equality, by the other; that is, $E = F$ whenever the
following is true:

• if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such that $E' = F'$; and

• if $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for some $E'$ such that $E' = F'$.

We can represent this situation pictorially as follows:

    E   =   F
    |a      |a
    v       v
    E'  =   F'


With  this  refinement  to  the  notion  of  equality  between  states,  we  can  show 
the  following  equations  to  be  true. 

(C1) If $E = F$ then $E + G = F + G$.

(C2) If $E = F$ then $a.E = a.F$.

These  are  important  properties  for  any  notion  of  equality  between  process 
terms,  as  they  ensure  that  a  term  does  not  change  when  we  replace  a  subterm 
within  the  term  by  an  equal  subterm. 

Exercise 11.18 (Solution on page 469)

Consider the following processes, all of which perform a-transitions over and
over, ad infinitum.

$A \stackrel{\text{def}}{=} a.A$, and

$A_i \stackrel{\text{def}}{=} a.A_{i+1}$ for each $i \in \mathbb{N}$.

Clearly, $A$ and $A_0$ exhibit the same behaviour. However, explain why we
cannot infer that $A = A_0$.


11.6 Additional Exercises

1.  As  we  saw  from  Example  11.3,  modelling  puzzles  have  a  long  history. 
Water  jug  puzzles  of  the  type  presented  in  Exercise  11.4  are  referred 
to  as  Tartaglian  water  measuring  problems  as  they  were  favourites 
of  the  16th-century  Italian  mathematician  Niccolo  Tartaglia  (though 
these days you’d no doubt be more successful searching online for
“Diehard water puzzle” than “Tartaglian water puzzle”). In fact, the
problem faced by John and Zeus in Exercise 11.4 was adapted from
the following puzzle posed by Abbot Albert in the 13th century.

Given  an  eight-unit  jug  filled  with  water,  an  empty  five-unit  jug  and 
an  empty  three-unit  jug,  how  can  we  divide  the  water  into  two  parts, 
each  exactly  four-unit?  (None  of  the  jugs  have  any  markings  on  them, 
and  we  cannot  estimate  quantities  by  eye;  we  can  only  measure  exact 
quantities  by  pouring  water  from  one  jug  to  another  until  one  of  the 
two  jugs  becomes  either  full  or  empty.) 

2. (a) Three married couples wish to cross a river. Their boat, however,
can only carry two people at a time. Also, the husbands are very
jealous:  each  one  of  them  refuses  to  let  his  wife  be  in  the  presence 
of  another  man  unless  he  himself  is  present  as  well.  How  can  they 
cross  the  river  using  the  fewest  number  of  trips? 

(b)  Argue  that  the  above  problem  cannot  be  solved  if  you  have  four 
couples  wanting  to  cross  the  river. 

(c)  Show  that  five  couples  can  cross  the  river  in  a  boat  that  can  carry 
three,  but  that  six  couples  cannot. 

3.  Alice,  Bob,  Carol  and  Dave  want  to  cross  a  river  in  a  boat.  However, 
their  boat  can  only  hold  100  kg.  Alice  is  46  kg,  Bob  is  49  kg,  Carol  is 
52  kg  and  Dave  is  60  kg.  Also,  Bob  has  a  broken  arm  and  can’t  row. 

How  can  they  all  get  across  the  river? 

4.  Alice,  Bob,  Carol  and  Dave  want  to  cross  a  bridge  in  the  dark  of  night. 
However,  the  bridge  is  rigged  with  a  bomb  which  is  due  to  explode, 
destroying  the  bridge,  in  17  minutes.  They  have  one  flashlight  which 
must  be  used  when  crossing  the  bridge,  but  the  bridge  can  hold  only 
two  people  at  once.  Their  walking  speeds  allow  them  to  cross  in  1,  2, 
5  and  10  minutes,  respectively;  when  two  of  them  cross  together,  they 
must  walk  together  with  the  flashlight. 

How  can  they  all  get  safely  across  the  bridge? 

5.  This  question  considers  further  bridge-crossing  problems  based  on  that 
in  question  4. 

(a)  How  quickly  can  six  people  cross  a  bridge  two-at-a-time  aided  by 
a  single  flashlight,  if  their  crossing  times  are  1,  3,  4,  6,  8  and  9 
minutes,  respectively? 

(b)  How  quickly  can  seven  people  cross  a  bridge  three-at-a-time  aided 
by  a  single  flashlight,  if  their  crossing  times  are  1,  2,  6,  7,  8,  9 
and  10  minutes,  respectively? 

6.  A  queen,  her  son,  and  her  daughter  are  being  held  captive  in  the  tower 
of  a  castle.  Outside  the  tower  window  is  a  rope  running  over  a  pulley 
with baskets of equal weight attached to the ends of the rope. One
basket is empty and is outside the window, while the other basket is on
the  ground  with  a  30  kg  rock  in  it.  One  basket  can  be  safely  lowered 
to  the  ground  using  the  other  basket  as  a  counterbalance  as  long  as  the 
difference  in  weight  between  the  two  baskets  does  not  exceed  6  kg;  if 
one  basket  is  more  than  6  kg  heavier  than  the  other,  the  heavier  basket 
will  crash  to  the  ground.  The  queen  weighs  78  kg,  her  daughter  42  kg 
and  her  son  36  kg.  Each  basket  can  hold  two  people,  or  one  person 
and  the  rock. 

How  can  the  queen  and  her  children  escape  to  the  ground  using  the 
smallest  number  of  steps? 

7. We can compute the value of x mod y using the following simple algorithm:


while x ≥ y do x := x - y
return x

(a)  Draw  the  transition  system  associated  with  the  computation  of 
the  value  72  mod  30. 

(b)  List  the  states,  actions  and  transitions  of  this  transition  system. 


8.  Consider  the  following  process  definition. 


$X \stackrel{\text{def}}{=} a.0 + a.Z$     $Y \stackrel{\text{def}}{=} a.Z$     $Z \stackrel{\text{def}}{=} a.Z$


(a)  Draw  the  labelled  transition  system  for  the  above  process. 

(b)  Explain  in  words  how  states  X  and  Y  differ,  behaviourally. 

9. Argue that the process $\mathit{Clock2}$ given by the process definition

$\mathit{Clock2} \stackrel{\text{def}}{=} \sum_{i > 0} \mathit{Cl}_i$

defines the same process as the process $\mathit{Clock}$ from Example 11.11.

10. Design a keypad lock which has three buttons labelled A, B and C.
Any of the keys can be pressed at any time, and if the correct sequence
of 5 key presses, namely BBCBA, is keyed in, then the lock will open.

11.  In  this  question,  we  study  the  specification  of  a  car  safety  system,  in 
which  a  bell  rings  (repeatedly)  whenever  the  ignition  is  on  while  the 
door  is  open  or  the  seat  belt  is  unbuckled. 

The labelled transition system for this specification is pictured in
Figure 11.10. Here we have a system with

• eight states $S = \{X_1, X_2, X_3, X_4, X_5, X_6, X_7, X_8\}$, and

• seven actions $A = \{\mathit{open}, \mathit{close}, \mathit{buckle}, \mathit{unbuckle}, \mathit{on}, \mathit{off}, \mathit{ring}\}$.

Figure 11.10: A car seat belt safety system.


For  example,  in  state  X4,  the  ignition  is  on,  the  seat  belt  is  buckled, 
the  door  is  open,  and  the  alarm  is  ringing. 

(a) The eight states in $S$ can be given process definitions, such as

$X_4 \stackrel{\text{def}}{=} \mathit{off}.X_2 + \mathit{open}.X_4 + \mathit{unbuckle}.X_5$

Give such a definition for each of the state variables in $S$.

(b) Let $D(x)$, $B(x)$, $M(x)$ and $R(x)$ be predicates defined over the
states $S$ as follows:

$D(x)$ = “the door is open in state $x$.”

$B(x)$ = “the seat belt is buckled in state $x$.”

$M(x)$ = “the ignition is on in state $x$.”

$R(x)$ = “the bell is ringing in state $x$.”


For each of these four predicates, indicate the states for which
they are true.

12. Adapt the elevator system from Example 11.8 (page 290) so that it
serves four floors rather than three.

13.  Adapt  the  elevator  system  from  Example  11.8  (page  290)  so  that  it 
models  two  elevators  operating  side-by-side,  which  are  called  using 
the  same  call  buttons  on  each  floor. 

14.  Give  a  process  definition  for  the  behaviour  of  the  simple  calculator  of 
Figure  11.1  (page  280). 

15.  Justify  the  following  equalities  from  Section  11.5.  (The  first  one  was 
demonstrated  in  Example  11.17,  page  302.) 

(S1) $E + 0 = E$.

(S2) $E + E = E$.

(S3) $E + F = F + E$.

(S4) $(E + F) + G = E + (F + G)$.

(S5) If $X \stackrel{\text{def}}{=} E$ then $X = E$.

16.  Justify  the  following  equalities  from  Section  11.5. 

(C1) If $E = F$ then $E + G = F + G$.

(C2) If $E = F$ then $a.E = a.F$.


Chapter  12 


Distinguishing  Between 
Processes 


Satire  is  a  lesson,  parody  is  a  game. 

-  Vladimir  Nabokov,  Strong  Opinions. 

If  we  consider  the  properties  which  we  used  to  distinguish  between  the 
vending  machines  from  Section  11.4,  we  quickly  notice  an  analogy  with 
the  way  in  which  strategies  for  the  two-player  games  of  Chapter  10  were 
discussed;  they  both  rely  heavily  on  the  use  of  modal  verbs  such  as  may 
and  must  to  describe  capabilities.  In  this  chapter,  we  shall  exploit  this 
analogy  by  devising  a  two-player  game  for  distinguishing  between  two  given 
processes.  In  this  game, 

•  the  first  player  will  aim  to  show  that  the  two  processes  are  different,  by 
looking  for  an  action  that  one  process  can  do  which  the  other  cannot; 

•  the  second  player  will  aim  to  show  that  the  two  processes  are  the  same, 
by  showing  that  each  process  can  copy  every  action  made  by  the  other. 

In  this  game,  one  of  the  two  players  will  always  have  a  winning  strategy 
(draws  will  not  be  possible);  the  two  processes  will  be  declared  to  be  the 
same  if  the  second  player  has  a  winning  strategy,  and  different  if  the  first 
has  a  winning  strategy. 


12.1 The Bisimulation Game

In this game we start by choosing two process states $E$ and $F$ (i.e., two
designated states of some transition system). For example, we may consider
the states $X$ and $U$ taken from the first of the two transition systems depicted
in Figure 12.1. We may also define an a priori “time limit” of $n \in \mathbb{N}$ moves,
or declare that the game has no time limit (i.e., take $n = \infty$). A game thus
defined is represented either by $G_n(E, F)$ or $G_\infty(E, F)$. The game is played
between two players, who have the following goals.

1. The first player wishes to demonstrate that the two chosen states are
in some way inherently different.

2.  The  second  player  wishes  to  demonstrate  that  the  two  chosen  states 
are  inherently  the  same. 

To  play  the  game,  we  start  by  placing  tokens  on  the  two  states  E  and  F, 
and  then  proceed  as  follows. 

1.  The  first  player  chooses  one  of  the  two  tokens,  and  moves  it  forward 
along  an  arrow  to  another  state  of  her  choosing;  if  this  is  impossible 
(that  is,  if  there  are  no  arrows  leading  out  of  either  of  the  states  on 
which  the  tokens  sit),  then  the  second  player  is  declared  to  be  the 
winner. 

2.  The  second  player  must  move  the  other  token  forward  along  an  arrow 
which  has  the  same  label  as  the  arrow  used  by  the  first  player;  if  this 
is  impossible,  then  the  first  player  is  declared  to  be  the  winner. 

This  exchange  of  moves  is  repeated  for  as  long  as  neither  player  gets  stuck, 
or  for  a  total  of  n  exchanges  of  moves  in  the  case  where  a  finite  time  limit 
n is defined. Note that the first player gets to choose which token to move
every time it is her turn; she does not have to keep moving the same
token.  If  the  second  player  succeeds  in  matching  every  move  of  the  first 
player,  then  he  is  declared  to  be  the  winner.  If  there  is  no  time  limit,  then 
the  second  player  is  declared  to  be  the  winner  of  any  play  of  the  game  that 
goes  on  forever.  (It  may  seem  rather  strange  to  declare  a  player  to  be  the 
winner of a play which lasts forever. However, there is nothing paradoxical
about this, and by doing so we ensure that there is always a winner; the
game  cannot  end  in  a  draw.) 
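
The rules of a single exchange of moves can be sketched in a few lines of Python. The transition system below is a small hypothetical one with states P, Q, R, S and T (it is not the system of Figure 12.1), and the function names `attacks` and `replies` are likewise assumptions made for this illustration.

```python
# A hypothetical transition system: state -> list of (label, target).
LTS = {
    "P": [("a", "Q")],
    "Q": [("b", "R")],
    "R": [],
    "S": [("a", "T")],
    "T": [],
}


def attacks(lts, e, f):
    """Moves open to the first player: (side, label, target) for either token."""
    return ([("left", label, target) for label, target in lts[e]]
            + [("right", label, target) for label, target in lts[f]])


def replies(lts, e, f, attack):
    """Positions (e', f') the second player can reach in answer to attack."""
    side, label, target = attack
    if side == "left":
        # The left token moved, so the right token must follow with the same label.
        return [(target, t) for lab, t in lts[f] if lab == label]
    # The right token moved, so the left token must follow with the same label.
    return [(t, target) for lab, t in lts[e] if lab == label]


# First exchange from position (P, S): the first player moves the left token.
first = attacks(LTS, "P", "S")[0]
print(first, replies(LTS, "P", "S", first))

# Second exchange from position (Q, T): the b-move cannot be answered.
second = attacks(LTS, "Q", "T")[0]
print(second, replies(LTS, "Q", "T", second))
```

An empty list from `attacks` corresponds to the case where the first player is stuck and the second player wins, while an empty list from `replies` corresponds to the second player being stuck and the first player winning.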

Example 12.1

Suppose we start with the tokens on states X and U of the first transition
system of Figure 12.1, and we assume that the time limit is (at least) 2.

1. The first player can move the token on state U along the a-labelled
arrow to state V; in response the second player must move the token
on state X along the a-labelled arrow to state Y.

2. The first player can then move the token on state Y along the c-labelled
arrow to state Z; the second player cannot respond to this move, as
there are no arrows with that label leading out of state V, so the first player
wins.

As the second player never has any options - and thus he can never have
made a bad move - this defines a winning strategy for the first player.


Example 12.2

Consider the following game played on the second transition system in
Figure 12.1 with the tokens on states 1 and 2, where we assume that the time
limit is (at least) 3.

1.  The  first  player  starts  by  moving  the  token  on  state  1  along  the  arrow 
labelled  a  to  state  5.  In  response,  the  second  player  has  to  move  the 
token  on  state  2  along  an  arrow  labelled  a;  there  are  three  ways  to  do 
this:  by  moving  the  token  to  state  2,  to  state  3,  or  to  state  6;  after 
some  thought,  he  chooses  to  move  the  token  to  state  6. 

2. The first player then moves the token on state 6 along the arrow
labelled a to state 4. In response, the second player has to move the
token on state 5 along an arrow labelled a; there are two ways to do
this: by moving the token to state 3 or to state 5; he chooses to move
the token to state 3.

3. The first player then moves the token on state 4 along the arrow
labelled b to state 5. In response, the second player has to move the
token on state 3 along an arrow labelled b; however, this is impossible,
so the first player is declared to be the winner.

In this case, the first player was lucky: the second player had several options
open to him in response to the moves of the first player, and he simply chose
poorly. Had the second player responded to the opening $1 \xrightarrow{a} 5$ move of the
first player by making the move $2 \xrightarrow{a} 2$, he could then have responded to all
subsequent moves of the first player. In fact, the second player has a winning
strategy in this game. This fact will be made evident in Section 12.4.


For  such  a  simply-defined  game,  the  fact  that  there  is  no  possibility  of  a 
draw  implies  that  one  of  the  two  players  has  a  winning  strategy.  This  fact 
is  embodied  in  the  following. 

Theorem 12.2

For any game $G_n(E, F)$ or $G_\infty(E, F)$, either the first player has a winning
strategy, or the second player has a winning strategy.


Exercise 12.2 (Solution on page 470)

Prove Theorem 12.2 for finite games $G_n(E, F)$, by induction on $n$.


Induction cannot be used to prove the result for infinite games $G_\infty(E, F)$.
Its proof is left as an exercise at the end of the chapter (Exercise 6, page 330).

Definition 12.2

We say that two process states $E$ and $F$ are n-game equivalent, written
$E \sim_n F$, if, and only if, the second player has a winning strategy in the
game $G_n(E, F)$. Similarly, we say that $E$ and $F$ are ∞-game equivalent,
written $E \sim_\infty F$, if, and only if, the second player has a winning strategy
in the game $G_\infty(E, F)$.


For example, if we again consider the three vending machines from
Section 11.4, we can note that their starting states are pairwise 2-game
equivalent but pairwise not 3-game equivalent.

1. $V_i \sim_2 V_j$ for $i, j \in \{1, 2, 3\}$.

The second player has a winning strategy in the game which ends after
the exchange of only two moves, as all three machines start with two
consecutive 10p transitions.

2. $V_1 \not\sim_3 V_2$ and $V_1 \not\sim_3 V_3$.

The first player has a winning strategy in the game which lasts for
three exchanges of moves, namely to play arbitrarily for the first two
exchanges of moves, and then to take the transition in the $V_1$ process
(coffee or tea) which is not available to the other process. The second
player will be stuck at this point and lose the game.

3. $V_2 \not\sim_3 V_3$.

The first player has a winning strategy in the game which lasts for
three exchanges of moves, namely to open with the transition in the
$V_3$ process towards the coffee transition, and then in the second move
to take the transition in the $V_2$ process towards the tea transition. The
first player can then take the tea transition in the $V_2$ process, which
the second player cannot respond to.
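
These claims can be checked by exhaustively exploring the game. The following Python sketch decides whether the second player has a winning strategy in a game with time limit n by trying every move of the first player and searching for a suitable reply. The state labels after the dot (such as `V2.tea`) and the function name `second_player_wins` are hypothetical; the three tables follow the definitions in Figure 11.8.

```python
# All three machines in one transition system: state -> list of (label, target).
LTS = {
    # V1 = 10p.10p.(coffee.collect.V1 + tea.collect.V1)
    "V1": [("10p", "V1.a")],
    "V1.a": [("10p", "V1.b")],
    "V1.b": [("coffee", "V1.c"), ("tea", "V1.c")],
    "V1.c": [("collect", "V1")],
    # V2 = 10p.(10p.coffee.collect.V2 + 10p.tea.collect.V2)
    "V2": [("10p", "V2.a")],
    "V2.a": [("10p", "V2.coffee"), ("10p", "V2.tea")],
    "V2.coffee": [("coffee", "V2.c")],
    "V2.tea": [("tea", "V2.c")],
    "V2.c": [("collect", "V2")],
    # V3 = 10p.10p.coffee.collect.V3 + 10p.10p.tea.collect.V3
    "V3": [("10p", "V3.ca"), ("10p", "V3.ta")],
    "V3.ca": [("10p", "V3.coffee")],
    "V3.ta": [("10p", "V3.tea")],
    "V3.coffee": [("coffee", "V3.c")],
    "V3.tea": [("tea", "V3.c")],
    "V3.c": [("collect", "V3")],
}


def second_player_wins(lts, e, f, n):
    """True if the second player has a winning strategy with n exchanges left."""
    if n == 0:
        return True  # the time limit is reached without the second player stuck
    # Every move of the left token needs a matching, still-winning reply on the right.
    for label, e1 in lts[e]:
        if not any(lab == label and second_player_wins(lts, e1, f1, n - 1)
                   for lab, f1 in lts[f]):
            return False
    # Every move of the right token needs a matching, still-winning reply on the left.
    for label, f1 in lts[f]:
        if not any(lab == label and second_player_wins(lts, e1, f1, n - 1)
                   for lab, e1 in lts[e]):
            return False
    return True


for e, f in (("V1", "V2"), ("V1", "V3"), ("V2", "V3")):
    print(e, f, second_player_wins(LTS, e, f, 2), second_player_wins(LTS, e, f, 3))
```

The recursion always terminates because the time limit decreases with every exchange; without memoisation its running time can grow exponentially in n, which is acceptable for an example of this size.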


Exercise 12.3 (Solution on page 471)

Recall the following processes from Exercise 11.16.

$A \stackrel{\text{def}}{=} b.c.0 + b.d.0$     $C \stackrel{\text{def}}{=} a.B + a.A$

$B \stackrel{\text{def}}{=} A + b.(c.0 + d.0)$     $D \stackrel{\text{def}}{=} a.B$

For which $n$ do we have that $C \sim_n D$? Justify your answer.


12.2 Properties of Game Equivalence


In  this  section,  we  explore  various  properties  of  game  equivalences,  beginning 
with  the  following  characterisation  which  in  particular  provides  an  elegant 
inductive  definition  of  finite  game  equivalences. 


Theorem 12.3


1. $E \sim_0 F$ for all processes $E$ and $F$.

2. $E \sim_{n+1} F$ if, and only if,

• if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such that $E' \sim_n F'$; and

• if $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for some $E'$ such that $E' \sim_n F'$.

3. $E \sim_\infty F$ if, and only if,

• if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such that $E' \sim_\infty F'$; and

• if $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for some $E'$ such that $E' \sim_\infty F'$.

Pictorially, 2. and 3. can be represented as follows:

    E   ~(n+1)   F            E   ~∞   F
    |a           |a           |a       |a
    v            v            v        v
    E'    ~n     F'           E'  ~∞   F'

Proof: The first result about 0-game equivalence is trivially true, as the
second player is immediately declared to be the winner of any game which
lasts for only 0 exchanges of moves.

For the second result, we note that the second player has a winning
strategy in the game $G_{n+1}(E, F)$ if, and only if, regardless of what move the
first player makes - either $E \xrightarrow{a} E'$ or $F \xrightarrow{a} F'$ - the second player can make
a response - either $F \xrightarrow{a} F'$ or $E \xrightarrow{a} E'$ - in such a way that he still has
a winning strategy in the game $G_n(E', F')$. But this is precisely what the
statement in the theorem says.

Similarly for the third result, we note that the second player has a winning
strategy in the game $G_\infty(E, F)$ if, and only if, regardless of what move
the first player makes - either $E \xrightarrow{a} E'$ or $F \xrightarrow{a} F'$ - the second player can
make a response - either $F \xrightarrow{a} F'$ or $E \xrightarrow{a} E'$ - in such a way that he still
has a winning strategy in the game $G_\infty(E', F')$. Again this is precisely what
the statement in the theorem says. □
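
The inductive characterisation suggests a simple way to compute the finite game equivalences on a finite transition system: start from the relation that relates every pair of states and apply the step from n to n+1 repeatedly. The Python sketch below does this for a small hypothetical system (states A, B0, B1, Once and Stop, chosen for this illustration). It relies on the additional assumption that the system has finitely many states, in which case the sequence of relations must eventually repeat, and the repeated relation is then the ∞-game equivalence; the function names `refine` and `game_equivalences` are hypothetical.

```python
# A hypothetical finite transition system: state -> list of (label, target).
LTS = {
    "A": [("a", "A")],        # ticks a forever in one state
    "B0": [("a", "B1")],      # ticks a forever, alternating between two states
    "B1": [("a", "B0")],
    "Once": [("a", "Stop")],  # does a once and then stops
    "Stop": [],
}


def refine(lts, rel):
    """One inductive step: from the pairs related at level n to those at level n+1."""
    def left_matched(e, f):
        return all(any(lab == label and (e1, f1) in rel for lab, f1 in lts[f])
                   for label, e1 in lts[e])

    def right_matched(e, f):
        return all(any(lab == label and (e1, f1) in rel for lab, e1 in lts[e])
                   for label, f1 in lts[f])

    return {(e, f) for e in lts for f in lts
            if left_matched(e, f) and right_matched(e, f)}


def game_equivalences(lts):
    """Yield the relations for n = 0, 1, 2, ... until the next one would repeat."""
    rel = {(e, f) for e in lts for f in lts}  # level 0 relates every pair
    while True:
        yield rel
        nxt = refine(lts, rel)
        if nxt == rel:
            return
        rel = nxt


levels = list(game_equivalences(LTS))
print([len(rel) for rel in levels])
print(("A", "B0") in levels[-1], ("A", "Once") in levels[-1])
```

On this system the state `Once` is separated from the perpetual tickers only at level 2, since it takes two exchanges of moves for the first player to expose the missing second a-transition.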

We  can  use  Theorem  12.3  to  prove  that  these  game  equivalence  relations 
are  indeed  equivalence  relations. 

Theorem 12.4

The relations $\sim_n$ and $\sim_\infty$ are all equivalence relations.


Proof: To show that the relations $\sim_n$ and $\sim_\infty$ are reflexive, that is, that
$E \sim_n E$ and $E \sim_\infty E$ for all $E$, we need to prove the following.

The  second  player  has  a  winning  strategy  in  any  game  in  which  the 
two  tokens  start  on  the  same  state  E  of  some  transition  system. 

This  is  obvious,  as  the  second  player  need  merely  copy  every  move  of  the 
first  player;  wherever  the  first  player  moves  one  of  the  tokens,  the  second 
player  moves  the  other  token  to  the  same  place. 

To show that the relations $\sim_n$ and $\sim_\infty$ are symmetric, that is, that
$F \sim_n E$ whenever $E \sim_n F$ and that $F \sim_\infty E$ whenever $E \sim_\infty F$, we need
to prove the following.

If  the  second  player  has  a  winning  strategy  in  a  game  in  which  the 
tokens  start  on  states  E  and  F  of  some  transition  system,  then  he 
also  has  a  winning  strategy  in  the  same  game  but  with  the  tokens 
starting  on  states  F  and  E. 

Again  this  is  obvious,  due  to  the  symmetry  of  the  game.  The  second  player 
need  merely  use  (essentially)  the  same  winning  strategy. 

To show that the relations $\sim_n$ and $\sim_\infty$ are transitive, that is,
that $E \sim_n G$ whenever $E \sim_n F$ and $F \sim_n G$, and that
$E \sim_\infty G$ whenever $E \sim_\infty F$ and $F \sim_\infty G$, we need to
prove the following.

If  the  second  player  has  a  winning  strategy  in  a  game  in  which 
the  tokens  start  on  states  E  and  F  of  some  transition  system,  and 
he has a winning strategy in the same game but with the tokens
starting  on  states  F  and  G,  then  he  also  has  a  winning  strategy 
in  the  same  game  but  with  the  tokens  starting  on  states  E  and  G. 

The  details  of  this  are  left  as  an  exercise.  □ 


Exercise 12.4 (Solution on page 471)


Prove, by induction on $n$, that the relation $\sim_n$ is transitive for each $n$.


That the relation $\sim_\infty$ is transitive cannot be proved by induction,
but is proven in Exercise 12.7 (page 317).

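Theorem 12.5

The relations $\sim_n$ and $\sim_\infty$ are strictly decreasing:
$\sim_0 \supsetneq \sim_1 \supsetneq \sim_2 \supsetneq \cdots \supsetneq \sim_\infty$.
In particular, if $E \sim_n F$ then $E \sim_k F$ for all $k < n$.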


Proof:  If  the  first  player  has  a  winning  strategy  in  a  game  of  length  n,  then 
she  can  use  that  strategy  to  win  any  game  with  a  longer  time  limit  (and  in 
particular,  the  game  with  no  predetermined  finite  time  limit).  Alternatively, 
if  the  second  player  has  a  winning  strategy  in  a  game  of  length  n,  or  one 
with  no  predetermined  finite  time  limit,  then  he  can  use  that  strategy  to 
win  any  game  with  a  shorter  time  limit.  This  demonstrates  the  sequence  of 
inclusions of the relations: if a pair of states is in $\sim_i$ it will be in
$\sim_j$ for all $j < i$, and hence
$\sim_0 \supseteq \sim_1 \supseteq \sim_2 \supseteq \cdots \supseteq \sim_\infty$.

That these inclusions are strict can be noted by observing that for all
$n \in \mathbb{N}$, $Cl_n \sim_n Cl$ but $Cl_n \not\sim_{n+1} Cl$; and that for
all $n \in \mathbb{N}$, $Clock \sim_n Clock*$ but $Clock \not\sim_\infty Clock*$,
where these clock processes were defined in Example 11.11 (page 296). □
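
The recursive characterisation of Theorem 12.3 can be run directly on small
examples. The following Python sketch is illustrative only: the names `succ`
and `equiv`, the encoding of states, and the assumed shape of the clocks (a
finite clock that ticks exactly n times and then stops, and a clock Cl that
ticks forever) are our own assumptions rather than the definitions of
Example 11.11.

```python
from functools import lru_cache

# Assumed encoding: "Cl" ticks forever; ("fin", k) ticks exactly k more times.
def succ(state):
    """Return the list of (label, target) transitions of a state."""
    if state == "Cl":
        return [("tick", "Cl")]
    _, k = state
    return [("tick", ("fin", k - 1))] if k > 0 else []

@lru_cache(maxsize=None)
def equiv(n, e, f):
    """n-game equivalence, computed by the recursion of Theorem 12.3."""
    if n == 0:
        return True  # the second player wins every game of length 0
    # Every move of e must be matched by an equally-labelled move of f ...
    forth = all(any(a == b and equiv(n - 1, e2, f2) for (b, f2) in succ(f))
                for (a, e2) in succ(e))
    # ... and every move of f by an equally-labelled move of e.
    back = all(any(a == b and equiv(n - 1, e2, f2) for (b, e2) in succ(e))
               for (a, f2) in succ(f))
    return forth and back

for n in range(5):
    print(n, equiv(n, ("fin", n), "Cl"), equiv(n + 1, ("fin", n), "Cl"))
```

Under these assumptions each printed line reports that the n-tick clock is
n-game equivalent to Cl but not (n+1)-game equivalent to it, which is the
first of the two strictness claims.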


Exercise 12.5 (Solution on page 472)

Prove the above claims, that for all $n \in \mathbb{N}$, $Cl_n \sim_n Cl$ but
$Cl_n \not\sim_{n+1} Cl$; and that for all $n \in \mathbb{N}$,
$Clock \sim_n Clock*$ but $Clock \not\sim_\infty Clock*$.


12.3 Bisimulation Relations

We might expect $\sim_\infty$ to be the “limit” of the $\sim_n$ relations,
that is, that the second player should have a winning strategy in the infinite
game whenever he has a winning strategy for arbitrarily-long finite games.
Alas, the above example disproves this intuition, as the two clocks Clock and
Clock* are n-game equivalent for all $n$ but they are not infinite-game
equivalent.

Clearly  these  two  clocks  cannot  be  considered  to  be  the  same;  the  first  one 
is  guaranteed  to  stop  after  some  indeterminate  number  of  ticks,  whereas  the 
latter  has  the  potential  to  tick  forever.  Infinite-game  equivalence  is  thus  the 
relation  we  wish  to  consider  as  defining  equivalence  between  processes,  and 
we  shall  henceforth  generally  refer  to  it  as  equivalence  rather  than  infinite- 
game  equivalence;  that  is,  when  we  declare  that  two  processes  are  equivalent, 
we  shall  mean  that  they  are  infinite-game  equivalent. 

If  our  intuition  had  been  right,  then  to  demonstrate  that  two  processes 
were  equivalent  we  could  exploit  Theorem  12.3  and  use  induction  to  prove 
them to be n-game equivalent for all $n \in \mathbb{N}$. However, in general we need
an  alternative  proof  strategy  to  induction.  Motivated  by  Theorem  12.3(3), 
we  define  the  following  notion  to  capture  the  essence  of  a  winning  strategy 
for  the  second  player  in  an  infinite  game. 

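Definition 12.5

A bisimulation relation is a binary relation $\mathcal{R}$ over states which
satisfies the following property: if $E \mathrel{\mathcal{R}} F$ then

• if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such that $E' \mathrel{\mathcal{R}} F'$; and

• if $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for some $E'$ such that $E' \mathrel{\mathcal{R}} F'$.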

We  can  represent  this  situation  pictorially  as  follows: 

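$$\begin{array}{ccc}
E & \mathcal{R} & F \\
{\scriptstyle a}\downarrow & & \downarrow{\scriptstyle a} \\
E' & \mathcal{R} & F'
\end{array}$$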


As desired, a bisimulation relation $\mathcal{R}$ represents a winning
strategy for the second player in an infinite game: whenever the two tokens
are on states which are related by $\mathcal{R}$, the second player can match
any move of the first player in such a way as to ensure that the tokens once
again end up on states related by $\mathcal{R}$. In this way, the second
player can repeatedly match the moves of the first player ad infinitum.
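
For a finite relation, the two clauses of Definition 12.5 can be checked
mechanically. The sketch below is only an illustration: the function name
`is_bisimulation`, the dictionary encoding of transitions, and the tiny
three-state system (a process P that loops on a, and a pair Q1, Q2 that
alternate on a) are hypothetical and do not come from the text.

```python
def is_bisimulation(rel, trans):
    """Check Definition 12.5 for a finite set `rel` of (E, F) pairs.

    `trans` maps each state to a list of (label, target) transitions.
    """
    for (e, f) in rel:
        # Every move of E must be matched by F, landing in a related pair.
        for (a, e2) in trans.get(e, ()):
            if not any(b == a and (e2, f2) in rel for (b, f2) in trans.get(f, ())):
                return False
        # Every move of F must be matched by E, landing in a related pair.
        for (a, f2) in trans.get(f, ()):
            if not any(b == a and (e2, f2) in rel for (b, e2) in trans.get(e, ())):
                return False
    return True

# Hypothetical system: P = a.P, Q1 = a.Q2, Q2 = a.Q1.
TRANS = {"P": [("a", "P")], "Q1": [("a", "Q2")], "Q2": [("a", "Q1")]}

print(is_bisimulation({("P", "Q1"), ("P", "Q2")}, TRANS))
print(is_bisimulation({("P", "Q1")}, TRANS))
```

The second relation fails the check because the move of Q1 to Q2 can only be
answered by P moving to P, and the pair (P, Q2) is missing from the relation;
a bisimulation must be closed under the matching moves it relies on.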

Theorem 12.6

The second player has a winning strategy in an infinite game with the tokens
starting on states $E$ and $F$ if, and only if, $E \mathrel{\mathcal{R}} F$
for some bisimulation relation $\mathcal{R}$. Hence in particular,
$\mathcal{R} \subseteq \sim_\infty$ for any bisimulation relation $\mathcal{R}$.

Proof: If $E \mathrel{\mathcal{R}} F$ for some bisimulation $\mathcal{R}$,
then the second player can merely use the winning strategy represented by
$\mathcal{R}$ as outlined above in order to win the infinite game with the
tokens starting on states $E$ and $F$.

Conversely, by Theorem 12.3(3), the relation $\sim_\infty$ itself is a
bisimulation relation. Hence, if the second player has a winning strategy in
an infinite game with the tokens starting on states $E$ and $F$, then
$E \mathrel{\mathcal{R}} F$ for the bisimulation relation
$\mathcal{R} = \sim_\infty$. □


Example 12.6

Consider  the  two  transition  systems  in  Figure  12.2.  It  is  straightforward 
to  confirm,  from  Definition  12.5,  that  the  following  binary  relation  is  a 
bisimulation  relation: 

K  =  {  (Pi,  Qi),  (P2,  Q2),  (P2,  Q 4),  (P3,  Qa),  (Pa,  Qa)  }• 

As $(P_1, Q_1) \in \mathcal{R}$, by Theorem 12.6 we get that $P_1 \sim_\infty Q_1$.


Exercise 12.6 (Solution on page 473)


Prove that the relation $\mathcal{R}$ in Example 12.6 is a bisimulation relation.


Exercise 12.7 (Solution on page 473)


Prove that if $\mathcal{R}$ and $\mathcal{S}$ are bisimulation relations over
the states of a labelled transition system, then so is
$\mathcal{R} \circ \mathcal{S}$. Infer from this that $\sim_\infty$ is
transitive: that if $E \sim_\infty F$ and $F \sim_\infty G$ then $E \sim_\infty G$.


As a final observation regarding the relationship between the finite-game
equivalences $\sim_n$ and the infinite-game equivalence $\sim_\infty$ we note
that the reason $\sim_\infty \neq \bigcap_{n \in \mathbb{N}} \sim_n$ in the
case of the two clocks - that is, that we can have $Clock \sim_n Clock*$ for
all $n \in \mathbb{N}$ but $Clock \not\sim_\infty Clock*$ - is solely due to
the  fact  that  these  clocks  can  perform  their  initial  tick  action  in  infinitely- 
many  ways,  leading  to  infinitely-many  states.  If  this  were  not  the  case,  then 
the  relations  would  coincide.  This  is  made  precise  as  follows. 

Definition 12.7

A process is image-finite if, and only if, for every state $E$ of the process,
and for every label $a$, the set $\{F : E \xrightarrow{a} F\}$ is finite.

Theorem 12.7

For image-finite processes, $\sim_\infty = \bigcap_{n \in \mathbb{N}} \sim_n$.

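Proof: Inclusion in one direction,
$\sim_\infty \subseteq \bigcap_{n \in \mathbb{N}} \sim_n$, is guaranteed by
Theorem 12.5: $\sim_\infty \subseteq \sim_n$ for all $n \in \mathbb{N}$, so
$\sim_\infty \subseteq \bigcap_{n \in \mathbb{N}} \sim_n$.

To show inclusion in the other direction,
$\bigcap_{n \in \mathbb{N}} \sim_n \subseteq \sim_\infty$, it suffices to
prove that the relation $\mathcal{R} = \bigcap_{n \in \mathbb{N}} \sim_n$ is a
bisimulation relation, for then by Theorem 12.6 we would have that
$\mathcal{R} \subseteq \sim_\infty$ as desired.

To this end, let $E \mathrel{\mathcal{R}} F$ be an arbitrary pair of states
related by $\mathcal{R}$, that is, $E \sim_n F$ for all $n \in \mathbb{N}$.
Assume first that $E \xrightarrow{a} E'$. Since $E \sim_{n+1} F$ for all
$n \in \mathbb{N}$, by Theorem 12.3(2) we have that for each
$n \in \mathbb{N}$, $F \xrightarrow{a} F_n$ for some $F_n$ with
$E' \sim_n F_n$. However, by image-finiteness there can be only finitely-many
such $F_n$. Hence the same state $F'$ must appear as $F_n$ for infinitely-many
values of $n$; that is, $F \xrightarrow{a} F'$ with $E' \sim_n F'$ for
infinitely-many $n \in \mathbb{N}$, and hence by Theorem 12.5 for all
$n \in \mathbb{N}$. Hence $E' \mathrel{\mathcal{R}} F'$.

By a symmetric argument, we can show that if $F \xrightarrow{a} F'$ then
$E \xrightarrow{a} E'$ for some $E'$ with $E' \mathrel{\mathcal{R}} F'$. Hence
$\mathcal{R}$ is indeed a bisimulation. □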

Exercise 12.8 (Solution on page 474)

In the definition of the bisimulation game, the first player was free to move
either token at each move. Suppose instead she must always move the same
token with each move. For example, if for her first move she moves the
token on state $F$, then she must always move that token in every move; at
no time can she switch and move the token which started on state $E$. Let
$E \asymp_n F$ if, and only if, the second player has a winning strategy in
this new game played for at most $n$ rounds (where $n$ may be $\infty$).

1. Show that $\asymp_n$ is an equivalence relation.

2. Show that $E \sim_n F$ implies $E \asymp_n F$. That is, if the second
player has a winning strategy in the bisimulation game, then he has a winning
strategy in this new game.

3. Show that $E \asymp_n F$ in general does not imply that $E \sim_n F$.
(Hint: consider the processes a.b.0 and a.b.0 + a.0.)

12.4 Bisimulation Colourings

Given  that  we  cannot  in  general  employ  the  inductive  characterisation  for 
finite-game  equivalences  to  prove  that  two  process  states  are  (infinite-game) 
equivalent,  we  here  devise  an  alternative  approach  to  inferring  if  and  when 
a  winning  strategy  exists  for  the  second  player  in  an  infinite  game.  The 
approach  relies  on  colouring  the  states  of  the  process  in  a  particular  fashion, 
thus  partitioning  the  states  into  equivalence  classes  defined  by  colour. 

Definition 12.8

A bisimulation colouring of a transition system is a colouring of the states
which satisfies the following property:

If some state with some colour $C$ has a transition leading out of it
into a state with some colour $C'$, then every state coloured $C$ has an
identically-labelled transition leading out of it into a state coloured $C'$.

For example, if some red state has an a-transition leading to a blue
state, then every red state has an a-transition leading to a blue state.

That is to say, if $E$ and $F$ have the same colour, then

• if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such
that $E'$ and $F'$ have the same colour; and

• if $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for some $E'$ such
that $E'$ and $F'$ have the same colour.

Two states $E$ and $F$ are bisimulation equivalent or bisimilar, written
$E \sim F$, if they have the same colour in some bisimulation colouring.


As  a  trivial  example,  if  we  assign  each  state  its  own  unique  colour,  then 
this  would  clearly  be  a  bisimulation  colouring.  However,  finding  a  bisimu¬ 
lation  colouring  which  assigns  the  same  colour  to  two  different  states  allows 
us  to  conclude  that  these  two  states  are  equivalent.  This  fact  is  recorded  in 
the  following. 

Theorem 12.8

$E \sim F$ if, and only if, $E \sim_\infty F$.


Proof: Given a bisimulation colouring of a transition system, the binary
relation $\mathcal{R}$ which relates like-coloured states is clearly a
bisimulation relation (according to Definition 12.5), and hence, by
Theorem 12.6, any two like-coloured states must be infinite-game equivalent.
That is, if $E \sim F$ (i.e., $E$ and $F$ have the same colour in some
bisimulation colouring) then $E \sim_\infty F$.

Conversely, consider colouring a transition system in such a way that any
two states $E$ and $F$ have the same colour if, and only if,
$E \sim_\infty F$. By Theorem 12.3(3), this colouring is clearly a
bisimulation colouring (according to Definition 12.8). Thus, if
$E \sim_\infty F$ then $E$ and $F$ have the same colour in this bisimulation
colouring, and hence $E \sim F$. □

This  new  characterisation  of  equivalence  gives  rise  to  the  following  ap¬ 
proach  to  demonstrating  that  two  states  of  a  transition  system  are  (or  are 
not)  equivalent.  We  start  with  all  states  being  the  same  colour  (white,  say), 
and  refine  this  colouring,  always  maintaining  the  following  invariant: 

Invariant: If $E \sim F$ then $E$ and $F$ have the same colour.

In this way, we start with a single equivalence class of states (i.e., start with all
states  assigned  the  same  colour),  and  refine  this  partition  by  subdividing  the 
equivalence  classes  (by  assigning  some  of  the  states  in  an  equivalence  class 
a  new  colour).  This  partition  refinement  algorithm  can  be  effectively 
implemented  to  prove  (or  disprove)  equivalences. 

As an illustrative example, consider the second transition system of
Figure 12.1.


The initial all-white colouring is not a bisimulation colouring, as the white
state 4 has a b-transition to a white state 5, whereas the other white states
1, 2, 3, 5 and 6 do not have b-transitions to white states. Hence, by the
invariant, state 4 cannot be equivalent to the other white states; in any
bisimulation colouring, state 4 must have a different colour from states 1,
2, 3, 5 and 6. Hence we may safely refine our colouring by making state 4 a
different colour (black, say).

This is still not a bisimulation colouring, as the white states 3 and 6 have
a-transitions  to  black  states,  whereas  the  other  white  states  1,  2  and  5  do 
not.  Hence,  by  the  invariant,  states  3  and  6  cannot  be  equivalent  to  the 
other  white  states;  in  any  bisimulation  colouring,  states  3  and  6  must  have 
a  different  colour  from  states  1,  2  and  5.  Hence  we  may  safely  refine  our 
colouring  by  making  states  3  and  6  a  different  colour  (grey,  say). 


This  colouring  is  a  bisimulation  colouring,  which  by  construction  satisfies 
our  invariant.  To  confirm  this,  we  merely  enumerate  the  possibilities. 

1. every white state has an a-labelled arrow leading into a white state,
and an a-labelled arrow leading into a grey state;

2. every grey state has an a-labelled arrow leading into a grey state, and
an a-labelled arrow leading into a black state; and

3. every black state has a b-labelled arrow leading into a white state.

Hence,  two  states  in  this  transition  system  are  equivalent  if,  and  only  if, 
they  have  the  same  colour. 
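
The enumeration above amounts to checking that like-coloured states have the
same set of (label, target colour) pairs, which is easy to automate. In the
sketch below the function name `is_bisimulation_colouring` is our own, and the
six-state system `TRANS` is a hypothetical stand-in chosen to be consistent
with the transitions mentioned in the text (Figure 12.1 itself is not
reproduced here, so its exact arrows are an assumption).

```python
def is_bisimulation_colouring(colour, trans):
    """Check Definition 12.8: like-coloured states must offer exactly the
    same set of (label, colour-of-target) pairs."""
    signature_of = {}
    for state in colour:
        sig = frozenset((a, colour[t]) for (a, t) in trans.get(state, ()))
        # The first state seen with a colour fixes that colour's signature.
        if signature_of.setdefault(colour[state], sig) != sig:
            return False
    return True

# Hypothetical six-state system (an assumption, not Figure 12.1 itself).
TRANS = {
    1: [("a", 2), ("a", 3)],
    2: [("a", 5), ("a", 6)],
    3: [("a", 6), ("a", 4)],
    4: [("b", 5)],
    5: [("a", 1), ("a", 6)],
    6: [("a", 3), ("a", 4)],
}

all_white = {s: "white" for s in TRANS}
final = {1: "white", 2: "white", 5: "white", 3: "grey", 6: "grey", 4: "black"}

print(is_bisimulation_colouring(all_white, TRANS))
print(is_bisimulation_colouring(final, TRANS))
```

On this stand-in system the all-white colouring is rejected and the
white/grey/black colouring is accepted, mirroring the argument in the text.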

For  the  first  transition  system  in  Figure  12.1,  a  little  reflection  reveals 
that  no  bisimulation  colouring  of  the  states  of  this  transition  system  exists 
in  which  the  states  X  and  U  have  the  same  colour. 


Exercise 12.9 (Solution on page 475)

Prove  the  above  claim  that  the  states  X  and  U  of  the  first  transition  system 
in  Figure  12.1  cannot  have  the  same  colour  in  any  bisimulation  colouring. 


This  completes  the  outline  of  our  algorithm  for  determining  whether 
two  states  of  a  transition  system  are  equivalent.  The  algorithm  works  by 
partitioning  the  states  into  equivalence  classes,  by  starting  with  the  trivial 
partition  consisting  of  a  single  class  containing  all  states,  and  repeatedly 
refining the partition by splitting one of the classes into two separate
sub-classes; it does this when it discovers that none of the states of one of
the new sub-classes can be equivalent to any of the states of the other.
If we carry this procedure out on a transition system with n states, then
clearly it can perform no more than n refinements, as each refinement gives
rise to a new class and we cannot produce a partition with more than n
classes.  Furthermore,  during  each  iteration  we  need  only  scan  the  edges 
of  the  transition  system  looking  for  a  transition  with  which  we  can  split 
a  partition.  Hence  if  there  are  k  edges  in  the  transition  system,  then  this 
naive  implementation  of  the  algorithm  would  execute  in  time  proportional 
to  nk. 
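
A minimal Python sketch of such a refinement loop follows. It is not the
exact procedure described above: rather than splitting one class at a time, it
recolours all states in each round according to their current colour and the
set of (label, target colour) pairs they offer, which should arrive at the
same final partition. The name `refine` and the six-state system `TRANS` are
hypothetical; the system is an assumed stand-in consistent with the worked
example, not a transcription of Figure 12.1.

```python
def refine(states, trans):
    """Naive partition refinement; returns a dict mapping state -> colour id."""
    colour = {s: 0 for s in states}          # start with every state "white"
    while True:
        ids = {}
        new = {}
        for s in states:
            # Old colour plus the (label, target colour) pairs on offer.
            sig = (colour[s],
                   frozenset((a, colour[t]) for (a, t) in trans.get(s, ())))
            new[s] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(colour.values())):
            return new                        # no class was split: stable
        colour = new

# Hypothetical six-state system (an assumption, not Figure 12.1 itself).
TRANS = {
    1: [("a", 2), ("a", 3)],
    2: [("a", 5), ("a", 6)],
    3: [("a", 6), ("a", 4)],
    4: [("b", 5)],
    5: [("a", 1), ("a", 6)],
    6: [("a", 3), ("a", 4)],
}

colouring = refine(sorted(TRANS), TRANS)
classes = {}
for state, c in colouring.items():
    classes.setdefault(c, []).append(state)
print(sorted(classes.values()))
```

On this stand-in system the first round separates state 4, the second
separates states 3 and 6, and the third round changes nothing, so the loop
stops with the classes {1, 2, 5}, {3, 6} and {4}.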

As  a  useful  by-product,  this  algorithm  produces  a  minimal-sized  (in 
terms  of  the  number  of  states)  transition  system  which  is  equivalent  to  the 
original transition system. In the above example, the minimal-sized
transition system has three states, which we might refer to as white, black and
grey,  and  is  depicted  as  follows. 
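
The minimal system can be read off from the enumeration of coloured
transitions given earlier: collapse each colour class to a single state and
keep one copy of each coloured transition. A short sketch, in which the name
`quotient` and the six-state system `TRANS` are hypothetical (the system is an
assumed stand-in for the second transition system of Figure 12.1):

```python
def quotient(colour, trans):
    """Collapse like-coloured states; returns a set of
    (source colour, label, target colour) transitions."""
    return {(colour[s], a, colour[t]) for s in trans for (a, t) in trans[s]}

# Hypothetical six-state system (an assumption, not Figure 12.1 itself).
TRANS = {
    1: [("a", 2), ("a", 3)],
    2: [("a", 5), ("a", 6)],
    3: [("a", 6), ("a", 4)],
    4: [("b", 5)],
    5: [("a", 1), ("a", 6)],
    6: [("a", 3), ("a", 4)],
}
COLOUR = {1: "white", 2: "white", 5: "white", 3: "grey", 6: "grey", 4: "black"}

for edge in sorted(quotient(COLOUR, TRANS)):
    print(edge)
```

The result has five transitions: white to white and white to grey on a, grey
to grey and grey to black on a, and black to white on b.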


Exercise 12.10 (Solution on page 475)

Carry  out  the  above  bisimulation  colouring  algorithm  on  the  first  transition 
system  of  Figure  12.1,  explaining  each  step  in  detail  as  above. 

Note  that  the  algorithm  is  nondeterministic;  there  may  be  several  ways  of 
splitting  a  set  of  like-coloured  states.  For  example,  starting  with  all  states 
of  the  transition  system  in  question  white,  there  are  three  possible  ways  to 
proceed. 

1.  White  states  U  and  X  both  have  an  a-transition  leading  to  a  white 
state,  while  white  states  V,  W,  Y  and  Z  do  not. 

2. White states V and Y both have a b-transition leading to a white state,
while  white  states  U,  W,  X  and  Z  do  not. 

3.  White  states  W  and  Y  both  have  a  c-transition  leading  to  a  white 
state,  while  white  states  U,  V,  X  and  Z  do  not. 

It  doesn’t  matter  which  choice  you  make;  the  end  result  will  be  the  same. 


★ 12.5 The Bisimulation Game Revisited: To Infinity and Beyond!

As we observed, the relations $\sim_n$ representing n-game equivalence do not,
in general, provide an adequate sequence of approximations to the
$\infty$-game equivalence $\sim_\infty$. This was demonstrated in
Exercise 12.5 by the example of the clocks, in which $Clock \sim_n Clock*$ for
all $n \in \mathbb{N}$ but $Clock \not\sim_\infty Clock*$. All is not lost
with the idea of approaching $\sim_\infty$ by a sequence of approximations.
The solution - which seems very odd on first encountering it - is to take the
advice of Buzz Lightyear and go to infinity and beyond. The example of the
clocks shows that it is not enough just to go to infinity through the natural
numbers $\sim_0, \sim_1, \sim_2, \sim_3, \ldots$. All we need to do is make
sense of the idea of going beyond infinity.

Consider two young children playing the game of “Who can name the
largest number?” in which they take turns naming larger and larger numbers.
They quickly run up against the problem of what numbers come after
one-million, one-billion, one-trillion, one-quadrillion, ..., until one
of them discovers the number googol ($10^{100}$, a one followed by 100 zeros);
but the other quickly responds with an even bigger number:
googol-plus-one! The first child’s argument that “There’s no such thing,
googol is the biggest number!” is of course wrong. But then eventually, one of
the children names the “number” infinity.

It is possible to accept the idea of naming infinity as a number, which
by definition is bigger than any natural number, and even to give it its own
symbol: $\omega$. But we will then be able to consider $\omega+1$ as a bigger
number, and $\omega+2$ as an even bigger number, and even $\omega+\omega$ as a
far, far greater number; these are all infinitely-big numbers, but some are
just bigger than others.

We already noted in Section 6.4, when comparing the sizes (cardinalities)
of sets, that infinity comes in different varieties; in particular, the
cardinality of the set of rational numbers is the same as the cardinality of
the set of natural numbers (Exercise 6.16) but strictly smaller than the
cardinality of the set of reals (Example 6.16). Infinite counting numbers (as
opposed to measuring numbers) also exist as mathematical objects, and are
collectively known as ordinal numbers. These are what will allow us to
approximate $\sim_\infty$.

12.5.1  Ordinal  Numbers 

The  ordinal  numbers  are  an  extension  of  the  natural  numbers  as  motivated 
above.  The  initial  segment  of  ordinals  is  as  follows: 

$0,\ 1,\ 2,\ \ldots,\ \omega,\ \omega+1,\ \omega+2,\ \ldots,\ \omega+\omega,\ \omega+\omega+1,\ \omega+\omega+2,\ \ldots$

Thus, after all finite ordinals have been listed (the natural numbers), the
first infinite ordinal $\omega$ is listed, and we can once again list
ever-bigger ordinals by successively adding one; after adding each natural
number to $\omega$ we reach the ordinal $\omega+\omega$, or $\omega \times 2$,
from which we continue the scheme, ad infinitum. The collection of ordinal
numbers is denoted by $O$. We shall not concern ourselves with the complete
theory of ordinal numbers. All we will need to know about ordinals are the
following four facts:

1. Every ordinal $X$ has a successor $X+1$, whose predecessor is $X$.

2. An ordinal is either: zero (i.e., 0); or a successor ordinal (i.e., $X+1$
for some ordinal $X$); or a limit ordinal which has a value which is greater
than all previous ordinals, but has no predecessor.

The first limit ordinal - which is the first infinite ordinal - is $\omega$.
It is the smallest ordinal greater than any finite ordinal (i.e., natural)
number; the next limit ordinal is $\omega+\omega$ (which is also written
$\omega \times 2$), then $\omega+\omega+\omega$ (or $\omega \times 3$), and
so on.

3. Given any set $S$ there is an ordinal $X \in O$ which represents the
cardinality of $S$; that is, there is a bijection between the set $S$ and the
set $\{Y \in O : Y < X\}$.

4. In order to show that a property $P(X)$ holds for all ordinals $X$, it
suffices to show the following:

$P(X)$ holds for $X$ whenever $P(Y)$ holds for all $Y < X$; that is,
$(\forall Y < X : P(Y)) \Rightarrow P(X)$.

This  principle  is  known  as  transfinite  induction  and  is  a  restatement 
of  the  principle  of  strong  induction  from  Section  9.4. 

For  those  who  find  this  brief  initiation  into  the  world  of  ordinal  numbers 
confusing,  you  may  find  it  helpful  to  concentrate  on  the  natural  numbers, 
and just think of $\omega$ whenever limit ordinals are mentioned in what follows.

Example 12.10

Consider the set $\mathbb{N} \times \mathbb{N}$ of pairs of natural numbers
ordered lexicographically: $(i, j) < (p, q)$ if, and only if, either $i < p$,
or $i = p$ and $j < q$. Thus, we can list these out in order as follows:


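$$\begin{array}{l}
\phantom{<{}} (0,0) < (0,1) < (0,2) < \cdots \\
< (1,0) < (1,1) < (1,2) < \cdots \\
< (2,0) < (2,1) < (2,2) < \cdots \\
< (3,0) < (3,1) < (3,2) < \cdots \\
< (4,0) < (4,1) < (4,2) < \cdots \\
< \cdots
\end{array}$$
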

This gives us a way to view the start of the list of ordinal numbers, namely
by associating the pair $(i, j) \in \mathbb{N} \times \mathbb{N}$ with the
ordinal number $\omega \times i + j$.
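
This pairing can be tried out directly, since Python compares tuples
lexicographically. The sketch below is purely illustrative: the helper names
`successor`, `is_limit` and `show` are our own, and the encoding only covers
the ordinals below $\omega \times \omega$ that the example describes.

```python
# A pair (i, j) stands for the ordinal ω×i + j; tuple comparison in Python
# is lexicographic, which is exactly the ordering described in the example.

def successor(x):
    """The successor of ω×i + j is ω×i + (j+1)."""
    i, j = x
    return (i, j + 1)

def is_limit(x):
    """Limit ordinals in this range are ω×i with i > 0 (no predecessor)."""
    i, j = x
    return i > 0 and j == 0

def show(x):
    """Render a pair in the ω notation used in the text."""
    i, j = x
    if i == 0:
        return str(j)
    head = "ω" if i == 1 else f"ω×{i}"
    return head if j == 0 else f"{head}+{j}"

sample = [(1, 0), (0, 7), (2, 3), (0, 0), (1, 2), (2, 0)]
print([show(x) for x in sorted(sample)])
print((0, 10**100) < (1, 0))   # every natural number, even googol, is below ω
```

Sorting the sample lists the finite ordinals first, then ω and ω+2, then ω×2
and ω×2+3, and the final comparison confirms that no finite ordinal reaches ω.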


12.5.2  Ordinal  Bisimulation  Games 

In  Section  12.1  we  defined  the  bisimulation  game  as  either  lasting  for  a 
predefined  finite  number  n  of  exchanges  of  moves,  denoting  the  game  by 
$G_n(E, F)$; or as continuing for as long as each player can make a move,
denoting the game in this case by $G_\infty(E, F)$. We can refine this notion
by defining the game $G_X(E, F)$ for any ordinal number $X$. From now on, we
shall use $G_X(E, F)$ to denote both the game itself as well as the position
that the game is in, with $X$ denoting in a precise sense the length of the
game.

1. From position $G_0(E, F)$, the second player is declared to be the winner.

This reflects the idea that the second player automatically wins any
game of length 0, as he need not copy any moves of the first player.

2. From position $G_{X+1}(E, F)$, the two players exchange moves once as
usual, and the play continues from $G_X(E', F')$, where $E'$ and $F'$ are
the states to which the two tokens have been moved.

This reflects the usual idea that a game of length $X+1$ consists of a
single exchange of moves followed by a game of length $X$.

3. From position $G_\lambda(E, F)$ where $\lambda$ is a limit ordinal, the
first player chooses a value $X < \lambda$, and the play continues from
position $G_X(E, F)$.

This reflects the idea that $G_\lambda(E, F)$ encompasses all games of length
less than $\lambda$; that is, $G_X(E, F)$ for any $X < \lambda$. If the second
player has a winning strategy in all such shorter games then he can force a
win in any such game that the first player chooses, so the second player can
force a win in this game $G_\lambda(E, F)$. However, if the first player has a
winning strategy in some such shorter game, then she can choose that
game and use her winning strategy to win the game $G_\lambda(E, F)$.

The  following  result  corresponds  to  Theorem  12.2  (page  312),  and  is 
similarly  proved  but  by  transfinite  induction  rather  than  simple  induction 
over  the  natural  numbers. 

Theorem 12.10

For any game $G_X(E, F)$, either the first player has a winning strategy, or
the second player has a winning strategy.


Proof: By transfinite induction. For the case $X = 0$, the second player
clearly has a winning strategy for the game $G_0(E, F)$.

Suppose that $X = Y+1$ is a successor ordinal, and that for any game
$G_Y(E', F')$ one of the two players has a winning strategy. The argument
that one of the two players has a winning strategy in the game
$G_{Y+1}(E, F)$ is identical to the induction step in the proof of
Theorem 12.2.

• Suppose that no matter what the first player does as her first move
in the game $G_{Y+1}(E, F)$, the second player can respond in such a way
that he gets into a position in which he has a winning strategy in the
game of length $Y$. This clearly provides a winning strategy for the
second player in the game $G_{Y+1}(E, F)$.

• Hence, if the second player does not have a winning strategy in the
game $G_{Y+1}(E, F)$, then the first player can make a move in such a way
that any response the second player makes results in a position from
which the second player does not have a winning strategy in the game
of length $Y$; but then by the inductive hypothesis, the first player has a
winning strategy in the game of length $Y$ from this resulting position,
which means she has a winning strategy for the game $G_{Y+1}(E, F)$.

Suppose finally that $X$ is a limit ordinal, and that for any game
$G_Y(E', F')$ with $Y < X$ one of the players has a winning strategy.

• If there is some $Y < X$ such that the first player has a winning strategy
in the game $G_Y(E, F)$, then she can choose this value $Y < X$ and use
this winning strategy to win the game $G_X(E, F)$.

• If there is no $Y < X$ such that the first player has a winning strategy
in the game $G_Y(E, F)$, then by the induction hypothesis the second
player has a winning strategy for the game $G_Y(E, F)$ for each $Y < X$,
and hence a winning strategy for the game $G_X(E, F)$. □


Definition 12.10

We say that two process states $E$ and $F$ are X-game equivalent, written
$E \sim_X F$, if, and only if, the second player has a winning strategy in the
game $G_X(E, F)$.


Example 12.11

From Exercise 12.5 we know that $Clock \sim_\omega Clock*$ since
$Clock \sim_n Clock*$ for all $n \in \mathbb{N}$ (i.e., $Clock \sim_X Clock*$
for all $X < \omega$).

However, $Clock \not\sim_{\omega+1} Clock*$, since the move
$Clock* \xrightarrow{tick} Cl$ by the first player in the game
$G_{\omega+1}(Clock*, Clock)$ must be matched by a move
$Clock \xrightarrow{tick} Cl_n$ for some $n \in \mathbb{N}$, but for no
$n \in \mathbb{N}$ do we have that $Cl \sim_\omega Cl_n$.
On the other hand, we do have that $tick.Clock* \sim_{\omega+1} tick.Clock$.

Exercise 12.11 (Solution on page 477)

Give process states $E_n$ and $F_n$ such that $E_n \sim_{\omega+n} F_n$ but
$E_n \not\sim_{\omega+n+1} F_n$.


We  can  now  extend  the  results  of  Section  12.2  about  game  equivalence  to 
ordinal  game  equivalence.  We  leave  most  of  the  proofs  as  exercises,  as  they 
are straightforward adaptations of the proofs of the analogous results
presented in Section 12.2. However, we prove the last result, which is the goal
of this section: that the sequence of equivalences $\sim_X$ does indeed
properly approximate $\sim_\infty$.

Theorem 12.11

1. $E \sim_0 F$ for all processes $E$ and $F$.

2. $E \sim_{X+1} F$ if, and only if,

• if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such that $E' \sim_X F'$; and

• if $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for some $E'$ such that $E' \sim_X F'$.

3. For limit ordinals $\lambda$, $E \sim_\lambda F$ if, and only if, $E \sim_X F$ for all $X < \lambda$.


Theorem 12.12

The relations $\sim_X$ are all equivalence relations.


Theorem 12.13

The relations $\sim_X$ are strictly decreasing. That is,
$\sim_X \subsetneq \sim_Y$ whenever $X > Y$.

Specifically, if for each ordinal $X \in O$ we define the process

$$E_X = \sum_{Y < X} a.E_Y,$$

then for all ordinals $X$ and $Y$ with $X < Y$, $E_X \sim_X E_Y$ but
$E_X \not\sim_{X+1} E_Y$.
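
For finite ordinals the witness processes of this theorem can be checked by
computation. The following Python sketch is restricted to the natural numbers
(it says nothing about infinite ordinals); the names `succ`, `step` and
`approx`, the bound `N`, and the encoding of $E_m$ by the integer m are all
assumptions made for illustration. It applies the successor clause of
Theorem 12.11 repeatedly, starting from the relation $\sim_0$ that relates
every pair of states.

```python
N = 4
STATES = range(N + 1)            # the integer m stands for the process E_m

def succ(m):
    """E_m is the sum of a.E_k over all k < m."""
    return [("a", k) for k in range(m)]

def step(rel):
    """One application of Theorem 12.11(2): build ~_{k+1} from ~_k."""
    def ok(e, f):
        forth = all(any(a == b and (e2, f2) in rel for (b, f2) in succ(f))
                    for (a, e2) in succ(e))
        back = all(any(a == b and (e2, f2) in rel for (b, e2) in succ(e))
                   for (a, f2) in succ(f))
        return forth and back
    return {(e, f) for e in STATES for f in STATES if ok(e, f)}

approx = [{(e, f) for e in STATES for f in STATES}]   # ~_0 relates everything
for _ in range(N):
    approx.append(step(approx[-1]))

for m in range(N):
    print(m, (m, m + 1) in approx[m], (m, m + 1) in approx[m + 1])
```

Each printed line reports that $E_m$ and $E_{m+1}$ are related by $\sim_m$ but
not by $\sim_{m+1}$, so on this finite fragment every approximation is
strictly smaller than the one before it.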


Theorem 12.14

$\sim_\infty = \bigcap_{X \in O} \sim_X$


Proof: Suppose that $E \sim_\infty F$; we shall show by transfinite induction
that $E \sim_X F$ for all $X \in O$.

• If $X = 0$ then clearly $E \sim_0 F$.

• Suppose that $X = Y+1$ is a successor ordinal.

- If $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such
that $E' \sim_\infty F'$, and hence by induction $E' \sim_Y F'$.

- If $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for some $E'$ such
that $E' \sim_\infty F'$, and hence by induction $E' \sim_Y F'$.

Thus we must have that $E \sim_{Y+1} F$.

• Suppose finally that $X$ is a limit ordinal. Then by induction we have
that $E \sim_Y F$ for all $Y < X$, and hence $E \sim_X F$.

To  show  inclusion  in  the  other  direction,  C\xeo  £  ~oo,  it  suffices  to 
prove  that  the  relation  ' R .  =  f)x€0  ~ x  is  a  bisimulation  relation,  for  then  by 
Theorem  12.6  we  would  have  that  72  C  ^  as  desired. 

To  this  end,  let  ETZF  be  an  arbitrary  pair  of  states  related  by  72,  that 
is,  E  F  for  all  X  E  O.  Assume  first  that  E  A  E'.  Since  E  ~x+i  F  for 
all  X  e  O,  by  Theorem  12.11(2)  we  have  that  for  each  X  e  O,  F  A  Fx 
for  some  Fx  with  E'  Fx.  The  set  {Fx  :  X  e  O}  can  be  no  greater 
that  the  set  of  all  states  which  are  reachable  from  the  state  F,  and  there 
are  ordinal  numbers  X  which  are  arbitrarily  larger  than  the  cardinality  of 
this  set  of  states.  Hence  there  must  be  some  state  F'  which  appears  as 
Fx  for  arbitrarily-large  values  of  X;  that  is,  F  A  F'  with  E'  F'  for 
arbitrarily-large  X  e  O,  and  hence  by  Theorem  12.13  for  all  X  e  O.  Hence 
E'TZF1. 

By  a  symmetric  argument,  we  can  show  that  if  F  -A  F'  then  E  -A  E' 
for  some  E1  with  E'TZF1 .  Hence  72  is  indeed  a  bisimulation.  □ 


12.6 Additional Exercises

1.  Carry  out  the  bisimulation  colouring  algorithm  step-by-step  on  the 
labelled  transition  system  defined  by  the  following  process  definition, 
and  use  this  to  provide  an  equivalent  system  with  a  minimal  number 
of  states. 

W  =  b.X  +  c.Z  X  =  a.Y 

Y  =  c.X  +  b.Z  Z  =  a.W  +  a.Y 


2.  Carry  out  the  bisimulation  colouring  algorithm  step-by-step  on  the 
labelled  transition  system  defined  by  the  following  process  definition, 
and  use  this  to  provide  an  equivalent  system  with  a  minimal  number 
of  states. 


X1 = a.Xi + b.X3                X4 = a.Xi + b.X3

X2 = a.X3 + a.X6 + b.Xt         X5 = a.X2 + a.X6 + b.Xj

X3 = a.X5                       X6 = a.X3 + a.X5 + b.X<

3. Consider the following labelled transition system.

[Figure: labelled transition system]


(a)  Which  states  are  2-game  equivalent  to  state  X6 ? 

(b)  Which  states  are  2-game  equivalent,  but  not  3-game  equivalent, 
to  state  X(P. 

(c)  Which  states  are  n-game  equivalent  to  state  X5  for  all  n? 

4.  Consider  the  following  labelled  transition  system. 


(a) For which $n$ do we have $W_1 \sim_n X_1$? Justify your answer.

(b) For which $n$ do we have $W_1 \sim_n Y_1$? Justify your answer.

(c) For which $n$ do we have $W_1 \sim_n Z_1$? Justify your answer.

(d) For which $n$ do we have $X_1 \sim_n Y_1$? Justify your answer.

(e) For which $n$ do we have $X_1 \sim_n Z_1$? Justify your answer.

(f) For which $n$ do we have $Y_1 \sim_n Z_1$? Justify your answer.

5.  Show  that  the  algebraic  laws  from  Section  11.5  are  true  of  bisimulation 
equivalence: 

(S1) $E + 0 \sim E$.

(S2) $E + E \sim E$.

(S3) $E + F \sim F + E$.

(S4) $(E + F) + G \sim E + (F + G)$.

(S5) If $X \stackrel{\text{def}}{=} E$ then $X \sim E$.

(C1) If $E \sim F$ then $E + G \sim F + G$.

(C2) If $E \sim F$ then $a.E \sim a.F$.

6.  Prove  that  the  following  binary  relation  on  the  states  of  a  labelled 
transition  system  is  a  bisimulation  relation: 

$\mathcal{R} = \{\, (E, F) :$ the first player does not have
a winning strategy in the game $G_\infty(E, F) \,\}$.

Conclude from this the result from Theorem 12.2 that for any game
$G_\infty(E, F)$, either the first player has a winning strategy, or the second
player has a winning strategy.

7.  In  Theorem  12.7,  only  one  of  the  two  processes  need  be  image-finite 
in  order  for  the  conclusion  to  be  true.  Prove  this  by  showing  that  the 
relation 

$\mathcal{R} = \{\, (E, F) : F \text{ is image-finite and } E \sim_i F \text{ for all } i \in \mathbb{N} \,\}$

is  a  bisimulation  relation. 

8. The trace set of a state $E$ is defined as

$$T(E) = \{\, s \in \Sigma^* : E \xrightarrow{s} F \text{ for some } F \,\}.$$

Two states $E$ and $F$ are trace equivalent, written $E =_t F$, if, and
only if, $T(E) = T(F)$. Finally, a state $E$ is deterministic if, and
only if, for all $s \in \Sigma^*$ there is at most one state $F$ such that $E \xrightarrow{s} F$.
That is, no state that is reachable from $E$ has two transitions with the
same label leading out of it.

(a) Prove, by induction on the length of $s$, that if $E \sim_n F$ and $E \xrightarrow{s} E'$
with $k = \mathrm{length}(s) < n$, then $F \xrightarrow{s} F'$ for some $F'$ with
$E' \sim_{n-k} F'$. Deduce from this that if $E \sim F$ then $E =_t F$.

(b) Prove that $\mathcal{R} = \{ (E, F) : E =_t F \text{ and } E, F \text{ are deterministic} \}$
is a bisimulation relation. Deduce from this that if $E =_t F$ and
$E$ and $F$ are deterministic, then $E \sim F$.

9. A trace bisimulation relation is a binary relation $\mathcal{R}$ over states
which satisfies the following property (where the extended transition
relation ${\rightarrow} \subseteq S \times \Sigma^* \times S$ is defined in the previous exercise):

If $E \mathcal{R} F$ then

• if $E \xrightarrow{s} E'$ then $F \xrightarrow{s} F'$ for some $F'$ such that $E' \mathcal{R} F'$; and

• if $F \xrightarrow{s} F'$ then $E \xrightarrow{s} E'$ for some $E'$ such that $E' \mathcal{R} F'$.

In terms of the bisimulation game, this reflects a change in the rules
which allows the first player to make a sequence of transitions, rather
than a single transition, which the second player must copy.

Prove that $\mathcal{R}$ is a trace bisimulation relation if, and only if, $\mathcal{R}$ is a
bisimulation.

10. A set $R \subseteq \Sigma$ is a refusal set of $E$ if, and only if, $E \stackrel{a}{\nrightarrow}$ for any $a \in R$.
A pair $(w, R) \in \Sigma^* \times 2^{\Sigma}$ is a failure of $E$ if, and only if, $E \xrightarrow{w} F$
for some $F$ such that $R$ is a refusal set of $F$. $E$ and $F$ are failures
equivalent, written $E =_f F$, if, and only if, they possess the same
failures.

(a) Prove that $E \sim F$ implies $E =_f F$, and that $E =_f F$ implies
$E =_t F$.

(b) Recalling the vending machines $V_1$, $V_2$ and $V_3$, prove that $V_1 \neq_f V_2$
but that $V_2 =_f V_3$, thus showing that the reverse implications do
not hold in general.

(c)  What  is  the  relationship  between  =f  and  x,  the  simulation  equiv¬ 
alence  from  Exercise  12.8? 

11. Ordinal numbers, viewed as sets, can be defined as follows:

• if $S$ is a set of ordinals, then $\bigcup S$ is an ordinal;

• if $X$ is an ordinal, then so is $X^+ = X \cup \{X\}$;

• nothing is an ordinal number unless it is constructed from the
above two rules.

Thus we can construct the first few ordinals as follows:

• $0 = \bigcup \emptyset = \emptyset$

• $1 = 0^+ = 0 \cup \{0\} = \emptyset \cup \{0\} = \{0\}$

• $2 = 1^+ = 1 \cup \{1\} = \{0, 1\}$

• $3 = 2^+ = 2 \cup \{2\} = \{0, 1, 2\}$

• $n = \{0, 1, 2, \ldots, n-1\}$

• $\omega = \bigcup \{0, 1, 2, \ldots\} = \{0, 1, 2, \ldots\}$

• $\omega + 1 = \omega^+ = \omega \cup \{\omega\} = \{0, 1, 2, \ldots, \omega\}$
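The two construction rules can be tried out for finite ordinals. The sketch below is only an illustration: it models ordinals as Python `frozenset` values, and the helper names `successor`, `big_union` and `finite_ordinal` are assumptions introduced here. Infinite ordinals such as $\omega$ are out of its reach.

```python
def successor(x: frozenset) -> frozenset:
    """The successor rule: X+ = X ∪ {X}."""
    return x | frozenset({x})


def big_union(ordinals) -> frozenset:
    """The union rule: the union of a collection of ordinals."""
    return frozenset().union(*ordinals)


def finite_ordinal(n: int) -> frozenset:
    """Build n by applying the successor rule n times, starting from 0 = ∅."""
    x = big_union([])  # 0 is the union of the empty set of ordinals
    for _ in range(n):
        x = successor(x)
    return x


zero, one, two, three = (finite_ordinal(k) for k in range(4))
```

With this encoding `three == frozenset({zero, one, two})`, and `one in three` plays the role of $1 < 3$.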


Intuitively, an ordinal is the set of ordinals less than it; and the
less-than relation corresponds to membership: $X < Y$ if, and only if,
$X \in Y$.

Prove the following facts about ordinal numbers $X$, $Y$ and $Z$ as defined
above.

(a) Every element of an ordinal $X$ is itself an ordinal.

(Proof: By induction on $X$.)

(b) If $X \in Y$ and $Y \in Z$ then $X \in Z$; that is, $\in$ is transitive.

(Proof: By induction on $Z$.)

(c) If $X \in Y$ then $X \subseteq Y$.

(Proof: Follows directly from previous result.)

(d) $X \notin X$.

(Proof: By induction on $X$.)

(e) $X \cap Y$ is an ordinal.

(Proof: By induction on $X$.)

12. Prove Theorem 12.11 (page 327).

13.  Prove  Theorem  12.12  (page  327). 

14.  Prove  Theorem  12.13  (page  327). 


Chapter  13 

Logical  Properties  of  Processes 


I  summed  up  all  systems  in  a  phrase,  and  all  existence  in  an  epi¬ 
gram. 

-  Oscar  Wilde. 

Thus  far  in  Part  II  of  the  book,  we  have  developed  the  understanding 
of  what  a  process  is,  namely  a  labelled  transition  system,  as  well  as  the 
means  for  describing  processes  formally  with  a  simple  process  language.  We 
have  also  defined  when  two  processes  are  equivalent  -  namely  when  they  are 
bisimilar,  which  we  presented  as  game  equivalent  -  as  well  as  a  procedure 
for  determining  if  two  processes  are  equivalent. 

Determining equivalence between processes is instrumental for finding
out if a proposed implementation of a computing system matches its
specification. However, we are often not interested in the complete behaviour of
a system, but rather only in certain aspects. For example, we may not be
interested - for the moment - in what actions a certain system does, but
rather we might only want to know if it can ever deadlock, that is, evolve
into a state in which it can perform no actions. This would be very useful
in the analysis of systems which are expected to be perpetual, such as
operating systems (particularly those running on critical systems). In other
instances we may be interested only in knowing if a given system may or
will ever perform a particular action, for example service a request such as
printing a document that has been sent to the printer queue.

In this chapter we shall consider a simple logic for expressing properties of
systems, as well as the means for determining whether or not a given process
satisfies such properties. The properties which the logic can express will be
dynamic (behavioural) properties which describe what actions a process can
or cannot do, rather than static properties, such as how many states a process
has, which are irrelevant for its correct functioning.

A given property will potentially hold of many different systems, and
fail to hold of many others. However, the properties that we express should
respect our understanding of equivalence: if a given property holds of a
particular process, then it should hold of any other equivalent process.
Conversely, if two processes are not equivalent, then you should be able to
express a property which distinguishes between these two processes; that is,
a property which holds of one of the processes but not the other. The logic
which we describe in this chapter is of this nature.

F. Moller, G. Struth, Modelling Computing Systems,
Undergraduate Topics in Computer Science,
DOI 10.1007/978-1-84800-322-4_14, © Springer-Verlag London 2013

Example 13.1

Consider the following two statements about a particular computer:

1. “The computer consists of three parts: a CPU, a memory unit, and a
bus for communicating with the environment.”

2. “CONTROL-ALT-DELETE can be pressed; this will shut down the
computer, which will then not do anything further.”

The first statement does not refer to the (dynamic) behaviour of the
computer, but rather to its (static) structure. As such it cannot be used to
distinguish between the behaviour of this and any other computer.
Another computer may (and likely will) consist of the same three parts yet
behave completely differently; while yet another may behave identically to
the computer in question despite being built completely differently.

By contrast, the second statement describes one particular aspect of the
behaviour of the computer which we may want our computer to demonstrate.
Another computer built from the same three parts may be unacceptable if
it does not behave the same when you press the CONTROL-ALT-DELETE
combination of keystrokes.


13.1 The Mays and Musts of Processes

In  trying  to  understand  the  differences  between  the  behaviours  of  the  various 
vending  machines  in  Section  11.4,  we  were  led  to  making  statements  such 
as  the  following  two: 

1.  We  may  do  a  ‘10p’  action  and  end  up  in  a  state  in  which 

we  may  do  a  ‘10p’  action  and  end  up  in  a  state  in  which 
we  cannot  do  a  ‘coffee’  action. 

2.  We  may  do  a  ‘10p’  action  and  end  up  in  a  state  in  which 

no  matter  how  we  do  a  ‘10p’  action 
we must end up in a state in which
we  cannot  do  a  ‘tea’  action. 

Thus we are expressing capabilities (and inabilities) of a process using the
two auxiliary verbs may and must, which are known as modal helping
verbs as they help set the modality - necessity or possibility - of the main
verb. In fact, we are using these verbs in a very strict manner, namely in
the following two contexts:

$\langle a \rangle P$: we may do an ‘a’ action and
end up in a state in which P is true;

$[a]P$: no matter how we do an ‘a’ action
we must end up in a state in which P is true.

We will use the above notation, $\langle a \rangle P$ (pronounced “diamond-a” P) and $[a]P$
(pronounced “box-a” P), for writing down such statements.

How, then, can we express the simple property that we may do a ‘coffee’
action? It doesn’t suffice to simply write:

$\langle \mathtt{coffee} \rangle$

as - following the translations given above - this reads in English as:

we may do a ‘coffee’ action and end up in a state in which.

This is not grammatically correct. In order to complete the sentence, we
must indicate a property that we require to be true of the state into which
the process evolves after doing the ‘coffee’ action. Every modality “$\langle a \rangle$”
and “$[a]$” has to be followed by some property P.

In this case, however, we don’t require anything in particular to be true
in the state we get into after doing the ‘coffee’ action; we are only content
that we can evolve into such a state. To solve this problem, we can use the
property true, which of course is itself true of any process state. Thus, to
express the property that we may do a ‘coffee’ action, we would write:

$\langle \mathtt{coffee} \rangle \mathsf{true}$

which  more  fully  says 

we  may  do  a  ‘coffee’  action  and 
end  up  in  a  state  in  which  true  is  true. 

Although  the  final  clause  is  redundant,  as  true  is  always  true  (that  is,  true  is 
true  in  every  state),  it  is  nonetheless  necessary  in  order  to  turn  the  expression 
into  a  complete  logical  statement. 

We may now express the two properties of our vending machines:

1. $\langle \mathtt{10p} \rangle \langle \mathtt{10p} \rangle \neg \langle \mathtt{coffee} \rangle \mathsf{true}$

2. $\langle \mathtt{10p} \rangle [\mathtt{10p}] \neg \langle \mathtt{tea} \rangle \mathsf{true}$

If we read these two lines as English statements following the translations
given above for the new notation - as well as reading the negation of a
property, $\neg P$, as “it is not the case that P” - we arrive at the following:

1. We may do a ‘10p’ action and end up in a state in which
   we may do a ‘10p’ action and end up in a state in which
   it is not the case that
   we may do a ‘coffee’ action and
   end up in a state in which true is true.

2. We may do a ‘10p’ action and end up in a state in which
   no matter how we do a ‘10p’ action
   we must end up in a state in which
   it is not the case that
   we may do a ‘tea’ action and
   end up in a state in which true is true.


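Exercise 13.1 (Solution on page 477)

Explain what each of the following properties expresses.

1. $\langle \mathtt{coffee} \rangle \mathsf{true}$

2. $\langle \mathtt{coffee} \rangle \mathsf{false}$

3. $[\mathtt{coffee}] \mathsf{true}$

4. $[\mathtt{coffee}] \mathsf{false}$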


Exercise 13.2 (Solution on page 477)

How  can  we  express  the  property  that  we  cannot  do  two  ‘a’  actions  in  a  row? 
Give  your  answer  using  the  above  notation,  and  write  out  your  property  in 
English. 


Exercise 13.3 (Solution on page 478)

How  can  we  express  a  property  that  distinguishes  between  the  clock  Cl  from 
Example  11.7  which  ticks  forever,  and  the  clock  Cl*  from  Exercise  11.8  which 
may  tick  forever  or  may  stop  ticking  after  any  tick?  That  is,  give  a  property 
using  the  above  notation  which  is  true  of  Cl  but  false  of  Cl*.  Write  out  your 
property  in  English  as  well. 


13.2 A Modal Logic for Properties


In the previous section we presented the core of a simple logical language
for expressing properties which may be true or false of a given process.
In this section we complete the description of this simple logic, which we
shall simply call HML (for Hennessy-Milner Logic, after its inventors). This
language consists essentially of propositional logic with the additional two
modal connectives $\langle a \rangle P$ (“diamond-a” P) and $[a]P$ (“box-a” P):

$$P, Q \;::=\; \mathsf{true} \;\mid\; \mathsf{false} \;\mid\; \neg P \;\mid\; P \wedge Q \;\mid\; P \vee Q \;\mid\; \langle a \rangle P \;\mid\; [a]P.$$

A formula $P$ of HML represents a property which may or may not be
true in a given state $E$ of a process. If it is true in that state, we shall write
$E \models P$ and say that the state $E$ satisfies the property $P$; otherwise we
will write $E \not\models P$ and say that the state $E$ does not satisfy the property $P$;
that is, by $E \not\models P$ we mean $\neg(E \models P)$. If a property is true in some state,
then we say that the property is satisfiable; and if it is true in every state,
then we say that it is valid.

Whether or not a property is true in a given state is defined inductively
on the structure of the formula $P$ as follows:

• $E \models \mathsf{true}$ for all $E$.

The property true is true in every state.

• $E \not\models \mathsf{false}$ for all $E$.

The property false is not true in any state.

• $E \models \neg P$ if, and only if, $E \not\models P$.

The property $\neg P$ is true in a state if, and only if, $P$ is not true in that
state.

• $E \models P \wedge Q$ if, and only if, $E \models P$ and $E \models Q$.

The property $P \wedge Q$ is true in a state if, and only if, both $P$ and $Q$ are
true in that state.

• $E \models P \vee Q$ if, and only if, $E \models P$ or $E \models Q$.

The property $P \vee Q$ is true in a state if, and only if, either $P$ or $Q$ (or
both) is true in that state.

• $E \models \langle a \rangle P$ if, and only if, $F \models P$ for some state $F$ such that $E \xrightarrow{a} F$.

The property $\langle a \rangle P$ is true in a state if, and only if, you can do an ‘a’
transition from that state to a state in which the property $P$ is true.

• $E \models [a]P$ if, and only if, $F \models P$ for all $F$ such that $E \xrightarrow{a} F$.

The property $[a]P$ is true in a state if, and only if, the property $P$ is
true in every state that you can get to by doing an ‘a’ transition from
that state.
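The inductive definition translates almost clause by clause into a recursive check over a finite labelled transition system. The following is a minimal sketch, not a definitive implementation: the tuple encoding of formulae, the function name `sat` and the two-state machine used as an example are all assumptions made for illustration.

```python
# Formulae are encoded as nested tuples (an illustrative choice):
#   ("true",)  ("false",)  ("not", P)  ("and", P, Q)  ("or", P, Q)
#   ("dia", a, P) for <a>P      ("box", a, P) for [a]P
TRUE = ("true",)
FALSE = ("false",)


def sat(state, formula, trans):
    """Decide whether `state` satisfies `formula`.

    `trans` maps (state, action) to the set of successor states.
    """
    tag = formula[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "not":
        return not sat(state, formula[1], trans)
    if tag == "and":
        return sat(state, formula[1], trans) and sat(state, formula[2], trans)
    if tag == "or":
        return sat(state, formula[1], trans) or sat(state, formula[2], trans)
    if tag in ("dia", "box"):
        _, action, body = formula
        results = (sat(s, body, trans) for s in trans.get((state, action), ()))
        # <a>P needs some a-successor satisfying P; [a]P needs all of them to.
        return any(results) if tag == "dia" else all(results)
    raise ValueError(f"unknown connective: {tag!r}")


# Hypothetical machine: M -10p-> Paid -coffee-> M
trans = {("M", "10p"): {"Paid"}, ("Paid", "coffee"): {"M"}}
pay_then_coffee = ("dia", "10p", ("dia", "coffee", TRUE))
no_coffee_yet = ("box", "coffee", FALSE)
```

In this example `sat("M", pay_then_coffee, trans)` and `sat("M", no_coffee_yet, trans)` both hold. The second holds because `M` has no ‘coffee’ transition at all, so the universal condition of the box modality is satisfied vacuously.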

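The syntax and semantics of the logic HML are summarised in Figure 13.1.

$$\begin{array}{lll}
E \models \mathsf{true} & & \text{for all } E. \\
E \models \mathsf{false} & & \text{for no } E. \\
E \models \neg P & \text{if, and only if,} & E \not\models P. \\
E \models P \wedge Q & \text{if, and only if,} & E \models P \text{ and } E \models Q. \\
E \models P \vee Q & \text{if, and only if,} & E \models P \text{ or } E \models Q. \\
E \models \langle a \rangle P & \text{if, and only if,} & F \models P \text{ for some } F \text{ such that } E \xrightarrow{a} F. \\
E \models [a]P & \text{if, and only if,} & F \models P \text{ for all } F \text{ such that } E \xrightarrow{a} F.
\end{array}$$

Figure 13.1: Syntax and semantics of the modal logic HML.

We shall make use of the following shorthand abbreviations: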

$\langle - \rangle P = \bigvee_{a} \langle a \rangle P$ where the disjunction ranges over the whole set of actions
of a process.

This property is true in a state if, and only if, you can do some
transition from that state to a state in which the property $P$ is true.

$[-]P = \bigwedge_{a} [a]P$ where the conjunction ranges over the whole set of actions
of a process.

This property is true in a state if, and only if, the property $P$ is true in
every state that you can get to by doing a transition from that state.

$\langle K \rangle P = \bigvee_{a \in K} \langle a \rangle P$ where $K$ is a set of actions (typically written without
set braces, as in $\langle a, b, c \rangle P$).

This property is true in a state if, and only if, you can do an ‘a’
transition from that state, for some $a \in K$, to a state in which the
property $P$ is true. This is the same as $\langle a \rangle P$ when $K = \{a\}$; the same
as $\langle - \rangle P$ when $K$ is the set of all actions of a process; and the same as
false when $K = \emptyset$.

$[K]P = \bigwedge_{a \in K} [a]P$ where $K$ is a set of actions (typically written without
set braces, as in $[a, b, c]P$).

This property is true in a state if, and only if, the property $P$ is true
in every state that you can get to by doing an ‘a’ transition from that
state, for some $a \in K$. This is the same as $[a]P$ when $K = \{a\}$; the
same as $[-]P$ when $K$ is the set of all actions of a process; and the
same as true when $K = \emptyset$.

$\langle -K \rangle P = \langle \overline{K} \rangle P$ where $K$ is a set of actions (typically written without
set braces, as in $\langle -a, b, c \rangle P$).

This property is true in a state if, and only if, you can do an ‘a’
transition from that state, for some $a \notin K$ (i.e., for some $a \in \overline{K}$), to a
state in which the property $P$ is true.

$[-K]P = [\overline{K}]P$ where $K$ is a set of actions (typically written without
set braces, as in $[-a, b, c]P$).

This property is true in a state if, and only if, the property $P$ is true
in every state that you can get to by doing an ‘a’ transition from that
state, for some $a \notin K$ (i.e., for some $a \in \overline{K}$).

Note  that  in  all  of  the  above  shorthand  formulas  we  assume  that  the  number 
of  possible  actions  is  finite;  the  logic  HML  does  not  have  infinite  conjunction 
or  disjunction. 
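As a worked illustration, assume (purely for the sake of the example) a process whose set of actions is $\{a, b, c\}$. The shorthand formulas then unfold as follows:

```latex
\begin{align*}
\langle - \rangle P    &= \langle a \rangle P \vee \langle b \rangle P \vee \langle c \rangle P \\
[-]P                   &= [a]P \wedge [b]P \wedge [c]P \\
\langle a, b \rangle P &= \langle a \rangle P \vee \langle b \rangle P \\
\langle -a \rangle P   &= \langle b, c \rangle P = \langle b \rangle P \vee \langle c \rangle P \\
[-a, b]P               &= [c]P \\
[-]\mathsf{false}      &= [a]\mathsf{false} \wedge [b]\mathsf{false} \wedge [c]\mathsf{false}
\end{align*}
```

The last formula holds in exactly those states that have no transitions at all, so it expresses that the state is deadlocked.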

Example 13.3

Consider the following two simple processes:

$$E \stackrel{\text{def}}{=} a.a.0 \qquad\qquad F \stackrel{\text{def}}{=} a.a.0 + a.0$$

These differ in that process $F$ may deadlock immediately after performing
the first ‘a’ action, whereas process $E$ is guaranteed to be able to perform a
second ‘a’ action after performing the first ‘a’ action. These can be rendered
in modal logic as follows:

• $F \models \langle a \rangle \neg \langle a \rangle \mathsf{true}$ whereas $E \not\models \langle a \rangle \neg \langle a \rangle \mathsf{true}$.

We may do an ‘a’ action and end up in a state in which
we cannot do another ‘a’ action.

This is true in state $F$ but not true in state $E$.

• $E \models [a] \langle a \rangle \mathsf{true}$ whereas $F \not\models [a] \langle a \rangle \mathsf{true}$.

No matter how we do an ‘a’ action,
we must end up in a state in which
we may do another ‘a’ action.

This is true in state $E$ but not true in state $F$.

When first learning to think logically with the modal verbs “may” and
“must,” it is easy to misinterpret properties, particularly when expressing
them in the language of HML. A common mistake arises when wanting to
express the property

I must do an ‘a’ action.

This property is not captured by the formula $\langle a \rangle \mathsf{true}$ which expresses the
property

I may do an ‘a’ action

as this allows the possibility of doing something other than an ‘a’ action;
if, for example, I could also do a ‘b’ action, then it would clearly not be the
case that I must do an ‘a’ action.

The next misconception is that - being a “must” property - we would
express the desired property (that an ‘a’ action must happen) as $[a]\mathsf{true}$.
However, this formula only expresses what must be true if and when you
do an ‘a’ action; it doesn’t assert that an ‘a’ action is even possible!
More precisely, it asserts that:

no matter how we do an ‘a’ action
we must end up in a state in which true is true

which is true of every state of a system whether or not it can do an ‘a’
action!

So how then do we express the property that an ‘a’ action must happen?
An ‘a’ action must happen precisely when an ‘a’ action may happen and no
other action may happen, which we can express in HML as follows:

$$\langle a \rangle \mathsf{true} \;\wedge\; \bigwedge_{b \neq a} \neg \langle b \rangle \mathsf{true}$$

or more simply using our shorthand as follows:

$$\langle a \rangle \mathsf{true} \;\wedge\; \neg \langle -a \rangle \mathsf{true}$$
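To see the three candidate formulas side by side, consider the processes $a.0$, $a.0 + b.0$ and $0$, assuming for this illustration that the only actions are $a$ and $b$ (so that $\langle -a \rangle$ is the same as $\langle b \rangle$):

```latex
\begin{array}{l|ccc}
 & a.0 & a.0 + b.0 & 0 \\ \hline
\langle a \rangle \mathsf{true} & \text{holds} & \text{holds} & \text{fails} \\
{[a]}\mathsf{true} & \text{holds} & \text{holds} & \text{holds} \\
\langle a \rangle \mathsf{true} \wedge \neg \langle -a \rangle \mathsf{true} & \text{holds} & \text{fails} & \text{fails}
\end{array}
```

Only the last formula singles out $a.0$, the one process of the three that must do an ‘a’ action.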


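Exercise 13.4 (Solution on page 478)

Consider the following transition system:


Which of the following are correct?

1. $E \models \langle a \rangle \mathsf{true}$

2. $E \models \langle b \rangle \mathsf{true}$

3. $E \models [a] \mathsf{false}$

4. $E \models [b] \mathsf{false}$

5. $E \models \langle a \rangle \langle a \rangle \mathsf{true}$

6. $E \models \langle a \rangle \langle b \rangle \mathsf{true}$

7. $E \models \langle a \rangle [a] \mathsf{false}$

8. $E \models \langle a \rangle [b] \mathsf{false}$

9. $E \models [a] \langle a \rangle \mathsf{true}$

10. $E \models [a] \langle b \rangle \mathsf{true}$

11. $E \models [a] [a] \mathsf{false}$

12. $E \models [a] [b] \mathsf{false}$


Exercise 13.5 (Solution on page 478)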

Express  the  following  properties  regarding  the  lamp  process  from  Section  11.2 
pictured  in  Figure  11.5  (page  289).  In  each  case,  indicate  which  of  the  three 
states  of  the  process  (Off,  On,  Broken)  satisfy  the  property  in  question. 

1.  I  may  do  two  ‘pull’  actions  in  a  row  followed  by  a  ‘break’  action. 

2.  I  may  do  two  ‘pull’  actions  in  a  row  followed  by  a  ‘reset’  action. 

3.  I  cannot  do  a  ‘pull’  action. 

4.  I  can  only  do  a  ‘pull’  action  (that  is,  I  must  do  a  ‘pull’  action). 


13.3 Negation Is Definable

In Section 11.4 we observed that the negation of a must property is
equivalent to a may property, and vice versa. This should have become apparent
as well from Exercise 13.4.

More precisely, we consider two formulae of our modal logic to be
equivalent if, and only if, they are true in the same states: $P \Leftrightarrow Q$ if, and only
if, for all states $E$: $E \models P \Leftrightarrow E \models Q$. Our observations about negating
modal properties are then expressed as follows:

$$\neg \langle a \rangle P \;\Leftrightarrow\; [a] \neg P \qquad \text{and} \qquad \neg [a] P \;\Leftrightarrow\; \langle a \rangle \neg P$$

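In words these say the following: the property which states that

we cannot do an ‘a’ action and
end up in a state in which $P$ is true

is equivalent to

no matter how we do an ‘a’ action
we must end up in a state in which $\neg P$ is true;

and the property which states that

it is not true that no matter how we do an ‘a’ action
we must end up in a state in which $P$ is true

is equivalent to

we may do an ‘a’ action and
end up in a state in which $\neg P$ is true.

We can motivate this correspondence by expressing the meaning of the
modal connectives in predicate logic. A may property says something about
some state to which you can go, whereas a must property says something
about all states to which you can go:

$$E \models \langle a \rangle P \quad \text{if, and only if,} \quad \exists F\, ( E \xrightarrow{a} F \;\wedge\; F \models P )$$

$$E \models [a] P \quad \text{if, and only if,} \quad \forall F\, ( E \xrightarrow{a} F \;\Rightarrow\; F \models P )$$

We can then reason about these properties using the rules for quantification
from Section 4.3:

$$\neg \forall x\, P(x) \;\Leftrightarrow\; \exists x\, \neg P(x) \qquad \text{and} \qquad \neg \exists x\, P(x) \;\Leftrightarrow\; \forall x\, \neg P(x).$$

For example, we can show the equivalence $\neg \langle a \rangle P \Leftrightarrow [a] \neg P$ as follows:

$$\begin{array}{rcl}
E \models \neg \langle a \rangle P & \Leftrightarrow & E \not\models \langle a \rangle P \\
 & \Leftrightarrow & \neg \exists F\, ( E \xrightarrow{a} F \wedge F \models P ) \\
 & \Leftrightarrow & \forall F\, \neg ( E \xrightarrow{a} F \wedge F \models P ) \\
 & \Leftrightarrow & \forall F\, ( E \xrightarrow{a} F \Rightarrow F \not\models P ) \\
 & \Leftrightarrow & \forall F\, ( E \xrightarrow{a} F \Rightarrow F \models \neg P ) \\
 & \Leftrightarrow & E \models [a] \neg P
\end{array}$$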


Exercise 13.6 (Solution on page 478)


Show the equivalence $\neg [a] P \Leftrightarrow \langle a \rangle \neg P$ by using the rules for quantification
to prove that $E \models \neg [a] P \;\Leftrightarrow\; E \models \langle a \rangle \neg P$.


In order to express that we cannot do an ‘a’ action, we can write

$\neg \langle a \rangle \mathsf{true}$ (It is not the case that we can do an ‘a’ action.)

By the above observation, since $\neg \mathsf{true} = \mathsf{false}$ this is equivalent to the
expression

$[a] \mathsf{false}$ (No matter how we do an ‘a’ action
we must end up in a state in which false is true.)

Since false cannot be true in any state, this means that there must be no
possibility of doing an ‘a’ action, as if we could do an ‘a’ action we would
have to end up in a state in which false is true.

Although HML includes negation, we can show that any property that
can be expressed in HML can be expressed without using negation. That
is, any formula $P$ of HML can be transformed into a formula $\mathit{pos}(P)$ which
contains no negation symbol and is semantically equivalent to $P$ in the sense
that $E \models \mathit{pos}(P)$ if, and only if, $E \models P$. This transformation is defined
together with a dual transformation $\mathit{neg}(P)$ which transforms the formula
$P$ into one which contains no negation symbols yet is semantically equivalent
to $\neg P$ in that $E \models \mathit{neg}(P)$ if, and only if, $E \not\models P$. Both transformations
involve pushing negations into formulas using De Morgan’s Laws, and are
defined inductively on the structure of the formula $P$ as follows:


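$$\begin{array}{rcl@{\qquad\qquad}rcl}
\mathit{pos}(\mathsf{true}) & = & \mathsf{true} & \mathit{neg}(\mathsf{true}) & = & \mathsf{false} \\
\mathit{pos}(\mathsf{false}) & = & \mathsf{false} & \mathit{neg}(\mathsf{false}) & = & \mathsf{true} \\
\mathit{pos}(\neg P) & = & \mathit{neg}(P) & \mathit{neg}(\neg P) & = & \mathit{pos}(P) \\
\mathit{pos}(P \wedge Q) & = & \mathit{pos}(P) \wedge \mathit{pos}(Q) & \mathit{neg}(P \wedge Q) & = & \mathit{neg}(P) \vee \mathit{neg}(Q) \\
\mathit{pos}(P \vee Q) & = & \mathit{pos}(P) \vee \mathit{pos}(Q) & \mathit{neg}(P \vee Q) & = & \mathit{neg}(P) \wedge \mathit{neg}(Q) \\
\mathit{pos}(\langle a \rangle P) & = & \langle a \rangle \mathit{pos}(P) & \mathit{neg}(\langle a \rangle P) & = & [a]\, \mathit{neg}(P) \\
\mathit{pos}([a] P) & = & [a]\, \mathit{pos}(P) & \mathit{neg}([a] P) & = & \langle a \rangle \mathit{neg}(P)
\end{array}$$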


It  is  immediately  clear  that  pos(P)  and  neg(P)  are  negation-free  terms, 
as  negation  does  not  appear  on  the  right-hand  side  of  any  of  the  defining 
equations. 
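The defining equations can be read directly as a pair of mutually recursive functions. The sketch below assumes a tuple encoding of formulae chosen here for illustration, namely `("true",)`, `("false",)`, `("not", P)`, `("and", P, Q)`, `("or", P, Q)`, `("dia", a, P)` for $\langle a \rangle P$ and `("box", a, P)` for $[a]P$; it is one possible rendering, not the book's own code.

```python
def pos(f):
    """Negation-free formula equivalent to f."""
    tag = f[0]
    if tag in ("true", "false"):
        return f
    if tag == "not":
        return neg(f[1])
    if tag in ("and", "or"):
        return (tag, pos(f[1]), pos(f[2]))
    if tag in ("dia", "box"):
        return (tag, f[1], pos(f[2]))
    raise ValueError(f"unknown connective: {tag!r}")


def neg(f):
    """Negation-free formula equivalent to the negation of f."""
    tag = f[0]
    if tag == "true":
        return ("false",)
    if tag == "false":
        return ("true",)
    if tag == "not":
        return pos(f[1])
    if tag == "and":  # De Morgan: not (P and Q) is (not P) or (not Q)
        return ("or", neg(f[1]), neg(f[2]))
    if tag == "or":   # De Morgan: not (P or Q) is (not P) and (not Q)
        return ("and", neg(f[1]), neg(f[2]))
    if tag == "dia":  # not <a>P is [a] not P
        return ("box", f[1], neg(f[2]))
    if tag == "box":  # not [a]P is <a> not P
        return ("dia", f[1], neg(f[2]))
    raise ValueError(f"unknown connective: {tag!r}")


# not <a> not <a> true
example = ("not", ("dia", "a", ("not", ("dia", "a", ("true",)))))
```

Applying `pos` to `example` pushes the outer negation through the diamond, where it cancels against the inner negation, leaving `("box", "a", ("dia", "a", ("true",)))`, that is, $[a]\langle a \rangle \mathsf{true}$.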


Theorem 13.6

For any process $E$ and any formula $P$ of HML:


1. $E \models \mathit{pos}(P)$ if, and only if, $E \models P$; and

2. $E \models \mathit{neg}(P)$ if, and only if, $E \not\models P$.


Proof:  By  induction  on  the  structure  of  P.  The  details  are  left  as  an 
exercise. 


Exercise 13.7 (Solution on page 478)

Prove Theorem 13.6.


Exercise 13.8 (Solution on page 482)


Prove, by induction on the structure of P, that neg(neg(P)) = P.

13.4 The Vending Machines Revisited

We  can  now  express  precisely  the  differences  between  the  three  vending 
machines  Vi,  V2  and  V3  introduced  in  Section  11.4,  by  writing  down  formulae 
of  the  logic  HML  which  distinguish  between  them.  Specifically,  we  shall 
produce  three  formulae  Pu  P2  and  P3  of  HML  such  that  Vt  |=  Pt  for  each  i, 
but  Vt  Pj  whenever  i  j.  That  is,  formula  Pt  will  distinguish  the 
machine  V,  from  the  other  two  machines  by  expressing  a  property  which  is 
true  of  machine  Vl  but  not  true  of  the  others. 

1.  $P_1 = [\mathit{10p}][\mathit{10p}]\langle \mathit{tea} \rangle\mathrm{true}$

This formula expresses the property that after doing two consecutive
‘10p’ actions, we must be in a state in which we can do a tea move.
This is true of $V_1$ as there is only one state in which we can be after
doing the two ‘10p’ actions, namely the state

$\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1$

and it is certainly the case that we may do a tea move from this state.

However, this is true neither of $V_2$ nor of $V_3$; in both of these cases it
is possible to do two consecutive ‘10p’ actions and end up in a state
where a ‘tea’ action is not possible (only a ‘coffee’ action). That is,
$V_2$ and $V_3$ satisfy the formula

$P_1' = \langle \mathit{10p} \rangle\langle \mathit{10p} \rangle[\mathit{tea}]\mathrm{false}$

while $V_1$ does not. (This formula is the negation of the one in question.)

2.  $P_2 = [\mathit{10p}]\langle \mathit{10p} \rangle[\mathit{tea}]\mathrm{false}$

This formula expresses the property that after doing a ‘10p’ action, we
will be able to do a further ‘10p’ action and end up in a state where
we cannot do a ‘tea’ action. This is true of $V_2$ as there is only one
state in which we can be after doing the first ‘10p’ action, namely the
state

$\mathit{10p}.\mathit{coffee}.\mathit{collect}.V_2 + \mathit{10p}.\mathit{tea}.\mathit{collect}.V_2$

and we can indeed do a further ‘10p’ action, getting to the state

$\mathit{coffee}.\mathit{collect}.V_2$

in which we cannot do a ‘tea’ action.

However, this is true neither of $V_1$ nor of $V_3$; in these cases it is possible
to do the following ‘10p’ actions:

•  $V_1 \xrightarrow{\mathit{10p}} \mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1)$

•  $V_3 \xrightarrow{\mathit{10p}} \mathit{10p}.\mathit{tea}.\mathit{collect}.V_3$

In both cases we end up in a state from which, after doing a further
‘10p’ action, we can do a ‘tea’ action. $V_1$ and $V_3$ thus satisfy the
formula

$P_2' = \langle \mathit{10p} \rangle[\mathit{10p}]\langle \mathit{tea} \rangle\mathrm{true}$

while $V_2$ does not. (This formula is the negation of the one in question.)

3.  $P_3 = \langle \mathit{10p} \rangle[\mathit{10p}][\mathit{tea}]\mathrm{false}$

This formula expresses the property that it is possible to do a ‘10p’
action and end up in a state from which we cannot do a further ‘10p’
action followed by a ‘tea’ action. This is true of $V_3$ as we can make
the transition

$V_3 \xrightarrow{\mathit{10p}} \mathit{10p}.\mathit{coffee}.\mathit{collect}.V_3$

and indeed find ourselves in a state from which we cannot do a further
‘10p’ action followed by a ‘tea’ action.

However, this is true neither of $V_1$ nor of $V_2$; in each of these cases
there is only one ‘10p’ transition possible, namely:

•  $V_1 \xrightarrow{\mathit{10p}} \mathit{10p}.(\mathit{coffee}.\mathit{collect}.V_1 + \mathit{tea}.\mathit{collect}.V_1)$

•  $V_2 \xrightarrow{\mathit{10p}} \mathit{10p}.\mathit{coffee}.\mathit{collect}.V_2 + \mathit{10p}.\mathit{tea}.\mathit{collect}.V_2$

In both cases we end up in a state from which we can do a further ‘10p’
action followed by a ‘tea’ action. $V_1$ and $V_2$ thus satisfy the formula

$P_3' = [\mathit{10p}]\langle \mathit{10p} \rangle\langle \mathit{tea} \rangle\mathrm{true}$

while $V_3$ does not. (This formula is the negation of the one in question.)
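
The claims made about the three formulae can be checked mechanically. The following is a minimal sketch, assuming that the three machines have the transition systems suggested by the states quoted in the discussion of the three formulae; the state names (such as "V1a") and the tuple encoding of formulae are hypothetical choices made for this illustration.

```python
# Formulae as nested tuples: ("true",), ("false",), ("not", P),
# ("and", P, Q), ("or", P, Q), ("dia", a, P) for <a>P, ("box", a, P) for [a]P.

def sat(lts, s, p):
    """Decide whether state s of the transition system lts satisfies p."""
    tag = p[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "not":
        return not sat(lts, s, p[1])
    if tag == "and":
        return sat(lts, s, p[1]) and sat(lts, s, p[2])
    if tag == "or":
        return sat(lts, s, p[1]) or sat(lts, s, p[2])
    successors = [t for (a, t) in lts.get(s, []) if a == p[1]]
    if tag == "dia":
        return any(sat(lts, t, p[2]) for t in successors)
    if tag == "box":
        return all(sat(lts, t, p[2]) for t in successors)
    raise ValueError("unknown formula tag: %r" % (tag,))

# Assumed transition systems of the three vending machines.
lts = {
    "V1": [("10p", "V1a")],
    "V1a": [("10p", "V1b")],
    "V1b": [("coffee", "V1c"), ("tea", "V1c")],
    "V1c": [("collect", "V1")],
    "V2": [("10p", "V2a")],
    "V2a": [("10p", "V2coffee"), ("10p", "V2tea")],
    "V2coffee": [("coffee", "V2c")],
    "V2tea": [("tea", "V2c")],
    "V2c": [("collect", "V2")],
    "V3": [("10p", "V3a"), ("10p", "V3b")],
    "V3a": [("10p", "V3coffee")],
    "V3b": [("10p", "V3tea")],
    "V3coffee": [("coffee", "V3c")],
    "V3tea": [("tea", "V3c")],
    "V3c": [("collect", "V3")],
}

TEA = ("dia", "tea", ("true",))
NO_TEA = ("box", "tea", ("false",))
P1 = ("box", "10p", ("box", "10p", TEA))
P2 = ("box", "10p", ("dia", "10p", NO_TEA))
P3 = ("dia", "10p", ("box", "10p", NO_TEA))

# table[i][j] records whether machine V(j+1) satisfies formula P(i+1).
table = [[sat(lts, v, p) for v in ("V1", "V2", "V3")] for p in (P1, P2, P3)]
```

Under these assumptions the table is true exactly on its diagonal: each formula holds of its own machine and of neither of the other two.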


Exercise 13.9 (Solution on page 482)

Recall the following processes from Exercise 11.16.

\[
\begin{aligned}
A &= b.c.0 + b.d.0 & C &= a.B + a.A \\
B &= A + b.(c.0 + d.0) & D &= a.B
\end{aligned}
\]

Give two formulae of HML which distinguish between $C$ and $D$: one formula
which is true of $C$ but not true of $D$; and one formula which is true of $D$
but not true of $C$.


13.5 Modal Properties and Bisimulation

We  have  now  developed  two  methods  for  distinguishing  between  processes. 

1.  In  Chapter  12  we  explicitly  defined  what  it  means  for  two  processes  to 
be  equivalent,  in  terms  of  winning  strategies  in  games. 

2.  In  this  chapter  we  defined  a  modal  logic  for  expressing  properties  of 
processes  with  which  we  can  distinguish  between  two  processes. 

We  may  well  wonder  if  these  two  methods  give  the  same  results. 

1.  We should be disturbed if two equivalent processes could be differentiated
by some formula of the modal logic. This would call into question the
usefulness of the logic as a tool for reasoning about the behaviour of
processes.

2.  It would also be disappointing, though less of a concern, if the modal
logic could not distinguish between some pair of non-equivalent processes.
This would mean simply that the logic is too weak to express
all aspects of the behaviour of a process.

However, we devised the equivalence based on a consideration of the
capabilities of the processes as expressed using precisely the types of modal
verbs which form the basis of our logic HML. Hence our intuition suggests
that the distinguishing power of the modal logic should coincide with the
equivalence. In this section we explore and confirm this intuition.

To  determine  if  two  processes  are  n-game  equivalent,  we  need  to  explore 
only  the  first  n  transitions  of  the  processes;  the  behaviour  of  the  processes 
after  n  transitions  is  irrelevant.  In  the  same  way,  in  order  to  determine 
whether  or  not  some  formula  of  the  modal  logic  is  true  of  some  process,  we 
need  only  to  explore  the  initial  behaviour  of  the  process;  exactly  how  deeply 
we  need  explore  the  process  depends  on  the  complexity  of  the  formula,  as 
defined  by  its  modal  depth. 


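The modal depth $\mathit{md}(P)$ of a formula $P$ of HML is defined inductively as
follows.

\[
\begin{aligned}
\mathit{md}(\mathrm{true}) &= 0 & \mathit{md}(P \land Q) &= \max(\mathit{md}(P), \mathit{md}(Q)) \\
\mathit{md}(\mathrm{false}) &= 0 & \mathit{md}(P \lor Q) &= \max(\mathit{md}(P), \mathit{md}(Q)) \\
\mathit{md}(\neg P) &= \mathit{md}(P) & \mathit{md}(\langle a \rangle P) &= 1 + \mathit{md}(P) \\
 & & \mathit{md}([a]P) &= 1 + \mathit{md}(P)
\end{aligned}
\]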

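The modal depth simply counts the maximum number of modal operators
along any path in the syntax tree of an HML formula. For example,
the formula $\langle a \rangle([b]\mathrm{false} \land [a]\langle b \rangle\mathrm{true})$ has a modal depth of 3, as
the following calculation, which follows the syntax tree of the formula,
shows.

\[
\begin{aligned}
\mathit{md}(\langle a \rangle([b]\mathrm{false} \land [a]\langle b \rangle\mathrm{true}))
 &= 1 + \mathit{md}([b]\mathrm{false} \land [a]\langle b \rangle\mathrm{true}) \\
 &= 1 + \max(\mathit{md}([b]\mathrm{false}),\ \mathit{md}([a]\langle b \rangle\mathrm{true})) \\
 &= 1 + \max(1, 2) \\
 &= 1 + 2 \\
 &= 3
\end{aligned}
\]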
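
The inductive definition of modal depth translates line by line into a recursive function. This is only a sketch; the tuple encoding of formulae is an assumption made for the illustration.

```python
# Formulae as nested tuples: ("true",), ("false",), ("not", P),
# ("and", P, Q), ("or", P, Q), ("dia", a, P) for <a>P, ("box", a, P) for [a]P.

def md(p):
    """Modal depth of the formula p."""
    tag = p[0]
    if tag in ("true", "false"):
        return 0
    if tag == "not":
        return md(p[1])
    if tag in ("and", "or"):
        return max(md(p[1]), md(p[2]))
    # <a>P and [a]P each add one level of modality.
    return 1 + md(p[2])

# <a>([b]false and [a]<b>true), the formula of the worked example
formula = ("dia", "a", ("and",
           ("box", "b", ("false",)),
           ("box", "a", ("dia", "b", ("true",)))))
depth = md(formula)
```

For the example formula the function returns 3, in agreement with the calculation above.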

The  following  theorem  demonstrates  that  no  formula  of  modal  depth  n 
can  distinguish  between  two  processes  which  are  n-game  equivalent.  The 
immediate  corollary  to  this  is  our  first  desired  result:  that  we  cannot  use 
the  logic  to  distinguish  between  equivalent  processes. 




Theorem 13.9


If $E \models P$ and $E \sim_n F$ where $n = \mathit{md}(P)$, then $F \models P$. That is, no
formula of modal depth $n$ can distinguish between two $n$-game equivalent
processes.


Proof: By induction on the structure of $P$, arguing by cases on the structure
of $P$:

$P = \mathrm{true}$: The result is immediately true in this case, as the conclusion
$F \models \mathrm{true}$ is always true.


$P = \mathrm{false}$: The result is vacuously true in this case, as the premise $E \models \mathrm{false}$
is false.


$P = \neg Q$: Since $E \models \neg Q$, we have $E \not\models Q$, and hence $F \not\models Q$ by induction
(if $F \models Q$ held then, as $\sim_n$ is symmetric, the induction hypothesis would give $E \models Q$),
so $F \models \neg Q$.


$P = Q_1 \land Q_2$: Note that $n_1 = \mathit{md}(Q_1) \le n$ and $n_2 = \mathit{md}(Q_2) \le n$; hence
$E \sim_{n_1} F$ and $E \sim_{n_2} F$.

Since $E \models Q_1 \land Q_2$, we have that $E \models Q_1$ and that $E \models Q_2$. Hence by
the induction hypothesis (applied twice), we have that $F \models Q_1$ and
that $F \models Q_2$, and thus that $F \models Q_1 \land Q_2$.


$P = Q_1 \lor Q_2$: Note that $n_1 = \mathit{md}(Q_1) \le n$ and $n_2 = \mathit{md}(Q_2) \le n$; hence
$E \sim_{n_1} F$ and $E \sim_{n_2} F$.

Since $E \models Q_1 \lor Q_2$, we have that $E \models Q_1$ or that $E \models Q_2$. Hence
by the induction hypothesis, we have that $F \models Q_1$ or
that $F \models Q_2$, and thus that $F \models Q_1 \lor Q_2$.


$P = \langle a \rangle Q$: Note first that $n = \mathit{md}(P) > 0$, and $\mathit{md}(Q) = n - 1$.

Since $E \models \langle a \rangle Q$, we have that $E \xrightarrow{a} E'$ for some $E'$ such that $E' \models Q$.
But then, since $E \sim_n F$, we must have that $F \xrightarrow{a} F'$ for some $F'$ such
that $E' \sim_{n-1} F'$. Hence by the induction hypothesis, we have that
$F' \models Q$, and thus that $F \models \langle a \rangle Q$ as required.


$P = [a]Q$: Note first that $n = \mathit{md}(P) > 0$, and $\mathit{md}(Q) = n - 1$.

To show that $F \models [a]Q$, we need to show that $F' \models Q$ whenever
$F \xrightarrow{a} F'$.

Suppose then that $F \xrightarrow{a} F'$. Since $E \sim_n F$ we must have that $E \xrightarrow{a} E'$
for some $E'$ such that $E' \sim_{n-1} F'$; furthermore, since $E \models [a]Q$, we
must have that $E' \models Q$. Thus by the induction hypothesis, we must
have that $F' \models Q$ as required.  □


We can express this result more succinctly if we first formulate the notion
of logical equivalence with respect to the formulae of HML of a fixed bounded
modal depth.

Definition 13.10

Let

\[ \mathrm{HML}_n = \{ P \in \mathrm{HML} : \mathit{md}(P) \le n \} \]

be the subset of HML consisting of all formulas of modal depth at most $n$.

1.  Two processes $E$ and $F$ are n-logically equivalent, written $E =_n F$,
if, and only if, the following holds.

For all $P \in \mathrm{HML}_n$: $E \models P$ if, and only if, $F \models P$.

That is, no formula of modal depth $n$ (or less) can distinguish between
them.

2.  The processes $E$ and $F$ are logically equivalent, written $E = F$, if,
and only if, the following holds.

For all $P \in \mathrm{HML}$: $E \models P$ if, and only if, $F \models P$.

That is, no formula (of any modal depth) can distinguish between
them.

Theorem 13.9 then states simply that $E =_n F$ whenever $E \sim_n F$.

Corollary 13.10

If $E \sim F$ then $E = F$, that is, no formula of the logic HML can distinguish
between two equivalent processes $E$ and $F$.


Proof: If $E$ and $F$ could be differentiated by a formula $P$ of HML, then
by the above we would have that $E \not\sim_n F$, where $n = \mathit{md}(P)$, and hence we
would have that $E \not\sim F$.  □

The converse result, that two processes which cannot be distinguished by
any property of HML must be equivalent, is not completely attainable. This
is due to the fact that equivalence is not the limit of the $n$-game equivalences,
while the logic HML is the limit of the bounded logics $\mathrm{HML}_n$. However,
as was the case with relating the finite-game equivalences to bisimulation
equivalence, this result holds when we restrict ourselves to image-finite
processes.

Theorem 13.10

For image-finite processes $E$ and $F$, if $E =_n F$ then $E \sim_n F$.


Proof: We shall prove, by induction on $n$, the equivalent contrapositive
statement that if $E \not\sim_n F$ then there is a formula $P$ of modal depth at most $n$
such that $E \models P$ but $F \not\models P$.

The base case ($n = 0$) is vacuously true, as the premise $E \not\sim_0 F$ cannot
hold.

For the induction step, suppose that $E \not\sim_{n+1} F$, and assume (without
loss of generality) that $E \xrightarrow{a} E'$ for some $E'$ such that $E' \not\sim_n F'$ whenever
$F \xrightarrow{a} F'$. Let

\[ F \xrightarrow{a} F_1, \quad F \xrightarrow{a} F_2, \quad \ldots, \quad F \xrightarrow{a} F_k \]

be all of the (finitely-many) $a$-transitions possible from $F$. Then $E' \not\sim_n F_i$
for each $i = 1, 2, \ldots, k$, and hence by the induction hypothesis there are
properties $P_1, P_2, \ldots, P_k$ of modal depth at most $n$ such that $E' \models P_i$ but $F_i \not\models P_i$
for each $i = 1, 2, \ldots, k$. The property $P$ we seek is then

\[ P = \langle a \rangle(P_1 \land P_2 \land \cdots \land P_k). \]

Clearly $E \models \langle a \rangle(P_1 \land P_2 \land \cdots \land P_k)$ but $F \not\models \langle a \rangle(P_1 \land P_2 \land \cdots \land P_k)$.  □

Corollary 13.11


For image-finite processes $E$ and $F$, if $E = F$ then $E \sim F$.


Proof: If $E = F$ then $E =_n F$ for all $n$, and hence by the above, $E \sim_n F$
for all $n$. Thus, since $E$ and $F$ are image-finite, $E \sim F$.  □

The clock processes Clock and Clock* from Example 11.11 pictured in
Figure 11.7 (page 298) provide the counter-example to this Corollary in
the case of infinite-branching processes. In this case, Clock $\sim_n$ Clock*
for all $n \in \mathbb{N}$, and hence Clock $=_n$ Clock* for all $n \in \mathbb{N}$, meaning that
Clock $=$ Clock*; however, Clock $\not\sim$ Clock*.
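
The proof of Theorem 13.10 is constructive, and for a finite transition system its construction can be sketched as a recursive search for a distinguishing formula. The function names, the tuple encoding of formulae and the small example system below are assumptions made for this illustration. The "without loss of generality" step of the proof is handled here by negating the formula when the unmatched move belongs to the second process.

```python
def succ(lts, s, a):
    """All a-successors of state s."""
    return [t for (b, t) in lts.get(s, []) if b == a]

def sat(lts, s, p):
    """Satisfaction for the fragment of HML produced by distinguish."""
    tag = p[0]
    if tag == "true":
        return True
    if tag == "not":
        return not sat(lts, s, p[1])
    if tag == "and":
        return sat(lts, s, p[1]) and sat(lts, s, p[2])
    if tag == "dia":
        return any(sat(lts, t, p[2]) for t in succ(lts, s, p[1]))
    raise ValueError("unknown formula tag: %r" % (tag,))

def conj(parts):
    """Conjunction of a list of formulae; the empty conjunction is true."""
    result = ("true",)
    for q in parts:
        result = q if result == ("true",) else ("and", result, q)
    return result

def distinguish(lts, n, e, f):
    """Return a formula of modal depth at most n that holds of e but not
    of f, or None when e and f are n-game equivalent."""
    if n == 0:
        return None
    for (x, y, negate) in ((e, f, False), (f, e, True)):
        for (a, x2) in lts.get(x, []):
            # x2 must be told apart from every a-successor of y.
            parts = [distinguish(lts, n - 1, x2, y2) for y2 in succ(lts, y, a)]
            if all(d is not None for d in parts):
                p = ("dia", a, conj(parts))
                return ("not", p) if negate else p
    return None

# Hypothetical example: s = a.b.0 + a.c.0 and v = a.(b.0 + c.0).
lts = {
    "s": [("a", "t"), ("a", "u")],
    "t": [("b", "z")],
    "u": [("c", "z")],
    "v": [("a", "w")],
    "w": [("b", "z"), ("c", "z")],
}
formula = distinguish(lts, 2, "s", "v")
```

In this example the two states are 1-game equivalent but not 2-game equivalent, and the search returns an encoding of $\langle a \rangle \neg\langle c \rangle\mathrm{true}$, which holds of s but not of v.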


13.6 Characteristic Formulae

Given a process state $E$, a formula $\mathit{cf}(E)$ of the logic HML is called a
characteristic formula for $E$ if, and only if, for all processes $F$:

$F \models \mathit{cf}(E)$ if, and only if, $F \sim E$.

For example, the characteristic formula for $0$ is

\[ \mathit{cf}(0) = [-]\mathrm{false} \]

as this formula specifies that there are no transitions possible from the state
in question.

Exercise 13.11 (Solution on page 483)

1.  Argue that the characteristic formula for $a.0$ is

\[ \mathit{cf}(a.0) = \langle a \rangle\mathrm{true} \land [-a]\mathrm{false} \land [-][-]\mathrm{false}. \]

2.  Give a characteristic formula for $a.(b.0 + c.0)$.


The existence of characteristic formulae further cements the close
connection between modal properties and bisimilarity. However, the exact
relationship as presented in the following theorem takes into account the finite
limitation of modal formulae: that they can reason only about the first steps
of a process up to a number of steps equal to the modal depth of the formula.

Theorem 13.11

For every $n \in \mathbb{N}$ and every state $E$ of an LTS defined over a finite set of
actions, there is a formula $\mathit{cf}_n(E) \in \mathrm{HML}_n$ such that, for all states $F$,

\[ F \models \mathit{cf}_n(E) \quad\text{iff}\quad E \sim_n F. \]

Furthermore, for every $n \in \mathbb{N}$ there are only finitely-many such formulas
$\mathit{cf}_n(E)$.


Proof: By induction on $n$.

For the base case we can take $\mathit{cf}_0(E) = \mathrm{true}$, since $F \models \mathrm{true}$ and $E \sim_0 F$
for every $F \in \mathit{States}$. Clearly there are only finitely-many (namely, one)
such formulas.

For the induction step, let

\[
\mathit{cf}_{n+1}(E) \;=\; \bigwedge \{\, \langle a \rangle\,\mathit{cf}_n(E') : E \xrightarrow{a} E' \,\}
\;\land\; \bigwedge_{a \in A} [a] \bigvee \{\, \mathit{cf}_n(E') : E \xrightarrow{a} E' \,\}.
\]

There are two parts to this formula:

•  The first conjunction of subformulae characterises which transitions are
possible: for each transition $E \xrightarrow{a} E'$, it must be possible to do an $a$
transition into a state characterised by the formula $\mathit{cf}_n(E')$.

•  The second conjunction of subformulae characterises the states into
which such a transition must evolve: upon performing an $a$ transition,
the process must evolve into a state characterised by the formula
$\mathit{cf}_n(E')$ for some $E'$ such that $E \xrightarrow{a} E'$.

Recalling the assumption that $A$ is finite, we can note that, even though
there may be infinitely-many transitions $E \xrightarrow{a} E'$, the two sets of
subformulae are, by induction, finite; hence, this is a well-formed formula (i.e., it is
of finite size), and there can only be finitely-many such formulae.

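Suppose that $F \models \mathit{cf}_{n+1}(E)$.

•  If $E \xrightarrow{a} E'$ then, since $F \models \langle a \rangle\,\mathit{cf}_n(E')$,
$F \xrightarrow{a} F'$ for some $F'$ such that $F' \models \mathit{cf}_n(E')$,
and thus by induction $E' \sim_n F'$.

•  If $F \xrightarrow{a} F'$ then, since $F \models [a] \bigvee \{\, \mathit{cf}_n(E') : E \xrightarrow{a} E' \,\}$,
$E \xrightarrow{a} E'$ for some $E'$ such that $F' \models \mathit{cf}_n(E')$,
and thus by induction $E' \sim_n F'$.

Hence, by the above Lemma, $E \sim_{n+1} F$.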

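Now suppose that $E \sim_{n+1} F$.

•  If $E \xrightarrow{a} E'$ then, by the above Lemma, $F \xrightarrow{a} F'$ for some $F'$ such that
$E' \sim_n F'$, and thus by induction $F' \models \mathit{cf}_n(E')$. As this is true of all
$a \in A$ and all $E'$ such that $E \xrightarrow{a} E'$,

\[ F \models \bigwedge \{\, \langle a \rangle\,\mathit{cf}_n(E') : E \xrightarrow{a} E' \,\}. \]

•  If $F \xrightarrow{a} F'$ then, by the above Lemma, $E \xrightarrow{a} E'$ for some $E'$ such that
$E' \sim_n F'$, and thus by induction $F' \models \mathit{cf}_n(E')$. As this is true of all
$a \in A$ and all $F'$ such that $F \xrightarrow{a} F'$,

\[ F \models \bigwedge_{a \in A} [a] \bigvee \{\, \mathit{cf}_n(E') : E \xrightarrow{a} E' \,\}. \]

Hence $F \models \mathit{cf}_{n+1}(E)$.  □


\[
\begin{aligned}
\|\mathrm{true}\| &= \mathit{States} \\
\|\mathrm{false}\| &= \emptyset \\
\|\neg P\| &= \mathit{States} \setminus \|P\| \\
\|P \land Q\| &= \|P\| \cap \|Q\| \\
\|P \lor Q\| &= \|P\| \cup \|Q\| \\
\|\langle a \rangle P\| &= \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ for some } E' \in \|P\| \,\} \\
\|[a]P\| &= \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ implies } E' \in \|P\| \,\}
\end{aligned}
\]

Figure 13.2: Global semantics of the modal logic HML.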
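
The inductive construction of $\mathit{cf}_n(E)$ in the proof of Theorem 13.11 can be sketched for a finite transition system as follows, together with a direct check of the statement of the theorem on a small example. The encoding of formulae as tuples, the function names and the example system are assumptions made for this illustration; the function equiv follows the step-by-step characterisation of $n$-game equivalence that the proof appeals to.

```python
def sat(lts, s, p):
    """Decide whether state s satisfies the formula p."""
    tag = p[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "and":
        return sat(lts, s, p[1]) and sat(lts, s, p[2])
    if tag == "or":
        return sat(lts, s, p[1]) or sat(lts, s, p[2])
    successors = [t for (a, t) in lts.get(s, []) if a == p[1]]
    if tag == "dia":
        return any(sat(lts, t, p[2]) for t in successors)
    return all(sat(lts, t, p[2]) for t in successors)   # tag == "box"

def big(tag, unit, parts):
    """Fold a list of formulae with 'and' or 'or'; unit for the empty list."""
    if not parts:
        return unit
    result = parts[0]
    for q in parts[1:]:
        result = (tag, result, q)
    return result

def cf(lts, actions, n, e):
    """Characteristic formula of state e up to n steps."""
    if n == 0:
        return ("true",)
    def succ(a):
        return sorted({t for (b, t) in lts.get(e, []) if b == a})
    # First part: every transition of e must be possible.
    may = [("dia", a, cf(lts, actions, n - 1, t))
           for a in actions for t in succ(a)]
    # Second part: every a-transition must lead into a state of e's kind.
    must = [("box", a, big("or", ("false",),
                           [cf(lts, actions, n - 1, t) for t in succ(a)]))
            for a in actions]
    return big("and", ("true",), may + must)

def equiv(lts, n, e, f):
    """n-game equivalence, computed step by step."""
    if n == 0:
        return True
    def matched(x, y):
        return all(any(b == a and equiv(lts, n - 1, x2, y2)
                       for (b, y2) in lts.get(y, []))
                   for (a, x2) in lts.get(x, []))
    return matched(e, f) and matched(f, e)

# Hypothetical example: s = a.b.0 + a.c.0 and v = a.(b.0 + c.0).
lts = {
    "s": [("a", "t"), ("a", "u")],
    "t": [("b", "z")],
    "u": [("c", "z")],
    "v": [("a", "w")],
    "w": [("b", "z"), ("c", "z")],
}
STATES = ["s", "t", "u", "v", "w", "z"]
ACTIONS = ["a", "b", "c"]

agrees = all(sat(lts, f, cf(lts, ACTIONS, n, e)) == equiv(lts, n, e, f)
             for n in range(4) for e in STATES for f in STATES)
```

The flag agrees records that, on this example and for depths 0 to 3, a state satisfies $\mathit{cf}_n(E)$ exactly when it is $n$-game equivalent to $E$. The formulae grow quickly with $n$, so the sketch is suitable only for small systems.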


13.7 Global Semantics

An alternative way to define the semantics of properties of HML is by
associating to each property $P \in \mathrm{HML}$ the set $\|P\|$ of states which satisfy the
property $P$. Determining whether or not $E \models P$ then would correspond to
determining if $E \in \|P\|$.

An inductive definition of the semantic function $\|P\|$ is given in
Figure 13.2, where the set $\mathit{States}$ represents the set of all states of the underlying
transition system. With this definition, we get the following result.

Theorem 13.12

$E \models P$ if, and only if, $E \in \|P\|$.


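Proof: By induction on the structure of $P$, arguing by cases on the structure
of $P$.

$P = \mathrm{true}$: $E \models \mathrm{true} \iff E \in \mathit{States} \iff E \in \|\mathrm{true}\|$.

$P = \mathrm{false}$: $E \models \mathrm{false} \iff E \in \emptyset \iff E \in \|\mathrm{false}\|$.

$P = \neg Q$: $E \models \neg Q \iff E \not\models Q \iff E \notin \|Q\| \iff E \in \mathit{States} \setminus \|Q\| \iff E \in \|\neg Q\|$.

$P = Q_1 \land Q_2$: $E \models Q_1 \land Q_2 \iff E \models Q_1$ and $E \models Q_2$
$\iff E \in \|Q_1\| \cap \|Q_2\| \iff E \in \|Q_1 \land Q_2\|$.

$P = Q_1 \lor Q_2$: $E \models Q_1 \lor Q_2 \iff E \models Q_1$ or $E \models Q_2$
$\iff E \in \|Q_1\| \cup \|Q_2\| \iff E \in \|Q_1 \lor Q_2\|$.

$P = \langle a \rangle Q$: $E \models \langle a \rangle Q \iff E \xrightarrow{a} E'$ for some $E'$ such that $E' \models Q$
$\iff E \xrightarrow{a} E'$ for some $E'$ such that $E' \in \|Q\| \iff E \in \|\langle a \rangle Q\|$.

$P = [a]Q$: $E \models [a]Q \iff E \xrightarrow{a} E'$ implies $E' \models Q$
$\iff E \xrightarrow{a} E'$ implies $E' \in \|Q\| \iff E \in \|[a]Q\|$.

□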
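
The global semantics lends itself to a bottom-up computation of the set of states satisfying a formula. The following sketch assumes a finite transition system; the three-state system used to exercise it, the function name sem and the tuple encoding of formulae are hypothetical choices made for this illustration.

```python
def sem(lts, states, p):
    """The set of states of the transition system which satisfy p."""
    tag = p[0]
    if tag == "true":
        return set(states)
    if tag == "false":
        return set()
    if tag == "not":
        return set(states) - sem(lts, states, p[1])
    if tag == "and":
        return sem(lts, states, p[1]) & sem(lts, states, p[2])
    if tag == "or":
        return sem(lts, states, p[1]) | sem(lts, states, p[2])
    target = sem(lts, states, p[2])
    def succ(s):
        return {t for (a, t) in lts.get(s, []) if a == p[1]}
    if tag == "dia":
        # Some a-successor lies in the target set.
        return {s for s in states if succ(s) & target}
    if tag == "box":
        # Every a-successor lies in the target set.
        return {s for s in states if succ(s) <= target}
    raise ValueError("unknown formula tag: %r" % (tag,))

# Hypothetical system: x -a-> y, y -a-> z, y -b-> x.
lts = {"x": [("a", "y")], "y": [("a", "z"), ("b", "x")]}
STATES = ["x", "y", "z"]

can_a = sem(lts, STATES, ("dia", "a", ("true",)))
stuck_after_a = sem(lts, STATES, ("dia", "a", ("box", "a", ("false",))))
after_b_can_a = sem(lts, STATES, ("box", "b", ("dia", "a", ("true",))))
```

Note that a box formula holds vacuously of any state with no matching transition, which is why after_b_can_a contains all three states here.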


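Exercise 13.12 (Solution on page 483)

Consider the following transition system:


Compute the following sets:

1.  $\|\langle a \rangle\mathrm{true}\|$
2.  $\|\langle b \rangle\mathrm{true}\|$
3.  $\|\langle a \rangle\langle a \rangle\mathrm{true}\|$
4.  $\|\langle b \rangle\langle b \rangle\mathrm{true}\|$
5.  $\|\langle a \rangle[a]\mathrm{false}\|$
6.  $\|[b]\langle a \rangle\mathrm{true}\|$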

13.8 Additional Exercises

1.  Give properties of the modal logic HML which distinguish between the
clocks $\mathit{Cl}_n$ of Example 11.7. That is, for each $n \in \mathbb{N}$, give a formula
$P_n$ of HML which is true of $\mathit{Cl}_n$ but false of $\mathit{Cl}_k$ for every $k \neq n$.

2.  What does the property $\langle a \rangle\mathrm{false}$ say? Can you give an example process
which satisfies this property?

3.  Express the negation of each of the following properties without using
the negation operator $\neg$. In each case, write out each property and its
negation in English.

(a)  $[a](\langle b \rangle\mathrm{true} \land \langle c \rangle\mathrm{true})$.

(b)  $[a]\langle b \rangle(\langle a \rangle\mathrm{true} \lor \langle b \rangle[a]\mathrm{false})$.
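4.  Consider the following 4-state transition system.


Fill in the following table with the states satisfying the relevant
properties. (The first line has been filled in to get you started.)


property $P$                                states satisfying $P$    negation $\neg P$      states satisfying $\neg P$

$\langle a \rangle\mathrm{true}$                      X, Y                     $[a]\mathrm{false}$    Z, 0

$[a]\mathrm{true}$

$\langle b \rangle\mathrm{true}$

$[b]\mathrm{true}$

$\langle a \rangle\langle b \rangle\mathrm{true}$

$\langle a \rangle[b]\mathrm{true}$

$[a]\langle b \rangle\mathrm{true}$

$[a][b]\mathrm{true}$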

5.  Consider  the  following  labelled  transition  system. 


Give  a  modal  logic  formula  which  distinguishes  between  and  X1. 
Argue why no formula of smaller modal depth can distinguish between
these two states.

6.  Give a labelled transition system with a state $s$ which satisfies all of
the following:

•  $\langle a \rangle(\langle a \rangle\mathrm{true} \land \langle b \rangle\langle a \rangle\mathrm{true})$

•  $\langle a \rangle\langle b \rangle(\langle b \rangle\mathrm{true} \land [a]\mathrm{false})$

•  $\langle a \rangle\langle b \rangle([a]\mathrm{false} \land [b]\mathrm{false})$

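7.  Recall the specification of the car safety system from Exercise 11 on
page 305.

(a)  Express $R(x)$ in the modal logic HML in two ways:

i.  one way involving only the action “ring”; and

ii.  another way not involving the action “ring”.

(Hint: First express $R(x)$ in terms of $D(x)$, $B(x)$ and $M(x)$.)

(b)  Which states satisfy the following formulae?

i.  $\langle \mathit{buckle} \rangle\mathrm{true} \land \langle \mathit{close} \rangle\mathrm{true}$

ii.  $\langle \mathit{buckle} \rangle\mathrm{true} \land [\mathit{close}]\mathrm{false}$

iii.  $\langle \mathit{on} \rangle\langle \mathit{ring} \rangle\mathrm{true}$

iv.  $[\mathit{on}]\langle \mathit{ring} \rangle\mathrm{true}$

v.  $\langle \mathit{open} \rangle(\langle \mathit{buckle} \rangle\mathrm{true} \land \langle \mathit{off} \rangle\mathrm{true})$

vi.  $\langle \mathit{open} \rangle(\langle \mathit{buckle} \rangle\mathrm{true} \lor \langle \mathit{off} \rangle\mathrm{true})$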

8.  Prove or disprove the following. (Here, equality between formulae
means that the formulae express the same properties.)

(a)  $\langle a \rangle(P \land Q) = \langle a \rangle P \land \langle a \rangle Q$.

(b)  $\langle a \rangle(P \lor Q) = \langle a \rangle P \lor \langle a \rangle Q$.

(c)  $[a](P \land Q) = [a]P \land [a]Q$.

(d)  $[a](P \lor Q) = [a]P \lor [a]Q$.

9.  As defined, the modal logic HML involves only binary conjunctions
and disjunctions, $P \land Q$ and $P \lor Q$, and thus by extension finite
conjunctions and disjunctions, $\bigwedge F$ and $\bigvee F$ for finite sets of formulae $F$.

Prove that if we allow infinite conjunctions and disjunctions, then
the logical characterisation of bisimulation equivalence is tight: that
$E \sim F$ if, and only if, $E$ and $F$ satisfy the same (infinitary) modal
logic formulae.

10.  Let $\mathrm{HML}_t$ be the subset of HML formulae generated by the following
BNF equation:

\[ P \;::=\; \mathrm{true} \;\mid\; \langle a \rangle P \]

Show that $\mathrm{HML}_t$ characterises trace equivalence $=_t$ from Exercise 8
on page 330, in the sense that $E =_t F$ if, and only if, $E$ and $F$ satisfy
the same formulae of $\mathrm{HML}_t$.

11.  Let $\mathrm{HML}_s$ be the subset of HML formulae generated by the following
BNF equation:

\[ P, Q \;::=\; \mathrm{true} \;\mid\; \langle a \rangle P \;\mid\; P \land Q \]

Show that $\mathrm{HML}_s$ characterises simulation equivalence from
Exercise 12.8 on page 318, in the sense that $E$ and $F$ are simulation
equivalent if, and only if, $E$ and $F$ satisfy the same formulae of $\mathrm{HML}_s$.

12.  Let $\mathrm{HML}_f$ be the subset of HML formulae generated by the following
BNF equation:

\[ P \;::=\; [K]\mathrm{false} \;\mid\; \langle a \rangle P \]

where $K$ is a set of actions. Show that $\mathrm{HML}_f$ characterises failures
equivalence $=_f$ from Exercise 10 on page 331, in the sense that $E =_f F$
if, and only if, $E$ and $F$ satisfy the same formulae of $\mathrm{HML}_f$.


Chapter  14 

Concurrent  Processes 


Many  hands  make  light  work. 

-  John  Heywood. 

Thus far the systems that we have considered have been simple sequential
processes, and have deviated from the standard (deterministic) notion
of a sequential program only by the presence of (nondeterministic) choice.
Of course the real interest in the study of systems arises when we permit
processes to run in parallel and interact with one another. There are a
variety of ways in which one might introduce operators into the language
to permit such concurrent process behaviour. In this chapter we introduce
a relatively simple operator, referred to as synchronisation merge, and
demonstrate its use in a variety of example applications.


14.1 Synchronisation Merge

In  this  section,  we  introduce  a  parallel  composition  operator  ||  which  allows 
two  processes  E  and  F  to  execute  in  parallel.  The  precise  fashion  in  which 
this  concurrent  execution  takes  place  must  be  defined;  in  particular,  we  must 
clearly  stipulate  in  what  fashion  such  concurrent  processes  may  interact  with 
one  another.  To  motivate  our  study,  we  start  with  a  simple  example. 


Example 14.1

Suppose  we  have  a  very  simple  factory  employing  two  workers.  The  first 
worker  takes  in  jobs  one  at  a  time  and,  after  carrying  out  some  work  on 
a  job,  passes  it  on  to  the  second  worker  (assuming  the  second  worker  isn’t 
still  working  on  an  earlier  job).  The  second  worker  takes  jobs  one  at  a  time 
from  the  first  worker  and,  after  carrying  out  some  work  on  a  job,  sends  it 
out  of  the  factory. 

The  two  workers  can  be  represented  by  the  following  two  simple  processes 
P  and  Q : 


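F. Moller, G. Struth, Modelling Computing Systems,

\[
P \stackrel{\mathrm{def}}{=} \mathit{in}.\mathit{pass}.P
\qquad\qquad
Q \stackrel{\mathrm{def}}{=} \mathit{pass}.\mathit{out}.Q
\]

[Figure: transition diagrams of the two processes. P moves to pass.P by an in transition and back to P by a pass transition; Q moves to out.Q by a pass transition and back to Q by an out transition.]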

When the two workers start their work day, the first can start working
immediately by taking in the first job, represented by the transition

\[ P \xrightarrow{\mathit{in}} \mathit{pass}.P \]

However, the second worker has to wait until the first worker has completed
working on the first job and passes it on; that is, the transition

\[ Q \xrightarrow{\mathit{pass}} \mathit{out}.Q \]

cannot take place in reality until the associated transition

\[ \mathit{pass}.P \xrightarrow{\mathit{pass}} P \]

takes place. The two workers synchronise on the pass action; they must
do this action together.

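Consider what we would see if we were to watch these two workers.
The behaviour of these two processes $P$ and $Q$ running together would be
represented by the following process, where we represent the two relevant
process states side-by-side separated by parallel lines $\|$:

\[
\begin{aligned}
P \,\|\, Q &\xrightarrow{\mathit{in}} \mathit{pass}.P \,\|\, Q \\
\mathit{pass}.P \,\|\, Q &\xrightarrow{\mathit{pass}} P \,\|\, \mathit{out}.Q \\
P \,\|\, \mathit{out}.Q &\xrightarrow{\mathit{in}} \mathit{pass}.P \,\|\, \mathit{out}.Q \\
P \,\|\, \mathit{out}.Q &\xrightarrow{\mathit{out}} P \,\|\, Q \\
\mathit{pass}.P \,\|\, \mathit{out}.Q &\xrightarrow{\mathit{out}} \mathit{pass}.P \,\|\, Q
\end{aligned}
\]

[Figure: transition diagram of the four states P || Q, pass.P || Q, P || out.Q and pass.P || out.Q.]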

Note  that  having  passed  a  job  on,  the  first  worker  can  take  in  another  job; 
however,  this  job  cannot  be  passed  on  until  the  second  worker  has  sent  out 
the  previous  job. 


In the above example, two processes $P$ and $Q$ are made to run in parallel.
This parallel composition is written as $P \,\|\, Q$, and each process is allowed
to perform certain of its actions independent of the other, but is forced
to synchronise with the other process on certain other actions. With this
understanding, we are now prepared to explain the formal definition of the
parallel composition operator $\|$.

We first require each process state $E$ to have a well-defined synchronisation
sort $\mathit{Sort}(E)$, denoting the subset of actions of the process on
which it synchronises with other processes; every state of a given process
will possess the same sort. The synchronisation sort of a process identifies
those actions which are, in essence, external to the process, and represent
those actions through which the process communicates with other processes
via synchronisation. They are the actions by which processes are
inter-connected.

Example 14.2

Suppose $\mathit{Sort}(P) = \{a, b\}$, $\mathit{Sort}(Q) = \{a, b, c\}$ and $\mathit{Sort}(R) = \{b, c\}$.
Composing these processes in parallel gives a system $P \,\|\, Q \,\|\, R$ in which the
individual components $P$, $Q$ and $R$ are directly connected through the
actions of their respective synchronisation sorts. The composed system can
thus be viewed schematically as follows:

[Figure: the processes P, Q and R drawn as separate components, inter-connected through the ports a, b and c.]

This depicts the whole system as consisting of the three physical processes $P$,
$Q$ and $R$ all operating independently. The behaviour of these three processes
is not depicted in the diagram, but they are inter-connected through the
three actions $a$, $b$ and $c$, which may be thought of as physical ports. The
result is a process $P \,\|\, Q \,\|\, R$, which can itself be composed in parallel with
further processes, with the synchronisation sort $\mathit{Sort}(P \,\|\, Q \,\|\, R) = \{a, b, c\}$:

[Figure: the process P || Q || R drawn as a single component with the ports a, b and c.]

The intention of the synchronisation sort of a process is to define which
actions are of importance when it comes to interaction; the individual actions
of $E$ may take place in the composition $E \,\|\, F$ so long as they are not in the
sort of the process $F$. However, $E$ must synchronise with $F$ on any action
from the sort of $F$ which $E$ is prepared to do. That is, $E$ cannot do an
action $a \in \mathit{Sort}(F)$ unless $F$ itself is prepared to do this action, in which
case $E$ and $F$ can perform this action in synchrony. Note that when we
compose two processes $E$ and $F$, the sort of the resulting process is the
union of the sorts of the components:

\[ \mathit{Sort}(E \,\|\, F) = \mathit{Sort}(E) \cup \mathit{Sort}(F). \]

With this understanding in place, we may give the formal semantic
definition of the synchronisation merge $E \,\|\, F$ of processes $E$ and $F$. There
are three rules governing the behaviour of $E \,\|\, F$:

1.  one which stipulates that $E \,\|\, F$ may perform a transition of $E$ as long
as it does not involve an action from the synchronisation sort of $F$;

2.  one which stipulates that $E \,\|\, F$ may perform a transition of $F$ as long
as it does not involve an action from the synchronisation sort of $E$;
and

3.  one which stipulates that $E \,\|\, F$ may synchronise on the performance,
by $E$ and $F$ together, of an action in either (or both) of their
synchronisation sorts.

Formally, these rules are as follows.

1.  If $E \xrightarrow{a} E'$ and $a \notin \mathit{Sort}(F)$ then $E \,\|\, F \xrightarrow{a} E' \,\|\, F$.

2.  If $F \xrightarrow{a} F'$ and $a \notin \mathit{Sort}(E)$ then $E \,\|\, F \xrightarrow{a} E \,\|\, F'$.

3.  If $E \xrightarrow{a} E'$ and $F \xrightarrow{a} F'$ and $a \in \mathit{Sort}(E) \cup \mathit{Sort}(F)$ then
$E \,\|\, F \xrightarrow{a} E' \,\|\, F'$.

One  further  point  to  make  is  that  two  equivalent  processes  must  have  the 
same  synchronisation  sort. 
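
The three rules can be sketched as a function which computes the transitions of a composed state, and applied to the factory of Example 14.1. The function names and the dictionary representation of transition systems are assumptions made for this illustration, as is the choice of {pass} as the synchronisation sort of both workers, which the example suggests but does not state.

```python
def merge_steps(lts, sort_e, sort_f, e, f):
    """Transitions of the composed state E || F as (action, (E', F')) pairs."""
    steps = set()
    for (a, e2) in lts.get(e, []):
        if a not in sort_f:                      # rule 1: E moves alone
            steps.add((a, (e2, f)))
    for (a, f2) in lts.get(f, []):
        if a not in sort_e:                      # rule 2: F moves alone
            steps.add((a, (e, f2)))
    for (a, e2) in lts.get(e, []):
        for (b, f2) in lts.get(f, []):
            if a == b and (a in sort_e or a in sort_f):
                steps.add((a, (e2, f2)))         # rule 3: joint move
    return steps

def explore(lts, sort_e, sort_f, start):
    """Reachable states and transitions of the composition from start."""
    seen, todo, edges = {start}, [start], set()
    while todo:
        (e, f) = todo.pop()
        for (a, nxt) in merge_steps(lts, sort_e, sort_f, e, f):
            edges.add(((e, f), a, nxt))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen, edges

# The two workers of Example 14.1.
lts = {
    "P": [("in", "pass.P")],
    "pass.P": [("pass", "P")],
    "Q": [("pass", "out.Q")],
    "out.Q": [("out", "Q")],
}
SORT_P = {"pass"}   # assumed: both workers synchronise on pass only
SORT_Q = {"pass"}

states, edges = explore(lts, SORT_P, SORT_Q, ("P", "Q"))
```

Under these assumptions the exploration finds four reachable states and five transitions. In particular, the state pass.P || out.Q offers only the out action, reflecting the fact that a second job cannot be passed on until the first has been sent out.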


Exercise 14.2 (Solution on page 483)

Why  is  it  important  that  equivalent  processes  have  the  same  sort? 

Hint:  We  wish  to  make  sure  that  if  A  ~  B  then  A  ||  X  ~  B  ||  X,  that 
is,  there  should  be  no  effect  in  the  functioning  of  a  system  if  we  replace 
one  component  A  with  an  equivalent  component  B.  What  might  happen, 
though,  if  A  and  B  have  the  same  behaviour  but  different  synchronisation 
sorts? 


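14.2 Counters

For any integer $k > 0$, a $k$-counter is a system which stores an integer value
between 0 and $k$ (inclusive). The $k$-counter can be:

•  incremented, as long as its value is less than $k$;

•  decremented, as long as its value is greater than 0; and

•  tested to see whether its value is zero.

For example, we can define a 1-counter $C$ by:

\[ C \stackrel{\mathrm{def}}{=} \mathit{iszero}.C + \mathit{inc}.\mathit{dec}.C \]

which defines the following transition system:

[Figure: transition system of C, with an iszero loop on C, an inc transition from C to dec.C, and a dec transition from dec.C back to C.]

Almost as simply, we can define a 2-counter by:

\[
\begin{aligned}
C_2 &\stackrel{\mathrm{def}}{=} \mathit{iszero}.C_2 + \mathit{inc}.C_2' \\
C_2' &\stackrel{\mathrm{def}}{=} \mathit{inc}.C_2'' + \mathit{dec}.C_2 \\
C_2'' &\stackrel{\mathrm{def}}{=} \mathit{dec}.C_2'
\end{aligned}
\]

which defines the following transition system:

[Figure: transition system of the 2-counter, with states C_2, C_2' and C_2''.]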


We  can  use  two  copies  of  the  simple  1-counter  to  “implement”  a  2-counter. 
Assuming  that  Sort(C)  =  {iszero},  the  transition  system  C  ||  C  is  depicted 
as follows:

Here, the initial state C || C can do an increment action inc in two ways,
either by allowing the left-hand 1-counter C to perform this action, or by
allowing the right-hand 1-counter C to perform it; as this action is not in
the synchronisation sort of C, neither process will block the other from
performing this action.

On the other hand, the iszero action is only possible in the initial state
C || C; as the action iszero is in the synchronisation sort of C, both
components of the parallel composition must participate in this action.

Generalising this result, we can show that a k-counter, for any k, can be
implemented by combining k copies of the simple 1-counter in parallel; that
is,

$$C_k \sim \underbrace{C \parallel C \parallel \cdots \parallel C}_{k \text{ copies}}.$$
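The claim can be checked mechanically for small k. The following is a minimal
sketch, not a definitive implementation: the helper names (leaf, par,
reachable, bisimilar) are hypothetical, the two interleaving rules are taken to
be "a component moves alone on any action outside the other component's sort",
and it is assumed that Sort(E || F) = Sort(E) ∪ Sort(F) and that the k copies
are nested to the left.

```python
from functools import reduce

def leaf(transitions, sort):
    """Sequential process: `transitions` maps a state to (action, next_state) pairs."""
    return (lambda s: transitions.get(s, [])), frozenset(sort)

def par(e, f):
    """E || F, where each process is a pair (step function, synchronisation sort)."""
    (step_e, sort_e), (step_f, sort_f) = e, f
    union = sort_e | sort_f  # assumption: Sort(E || F) = Sort(E) ∪ Sort(F)

    def step(state):
        se, sf = state
        moves_e, moves_f = step_e(se), step_f(sf)
        moves = [(a, (e2, sf)) for a, e2 in moves_e if a not in sort_f]   # E alone
        moves += [(a, (se, f2)) for a, f2 in moves_f if a not in sort_e]  # F alone
        moves += [(a, (e2, f2)) for a, e2 in moves_e for b, f2 in moves_f
                  if a == b and a in union]                               # rule 3
        return moves

    return step, union

def reachable(step, start):
    seen, todo = {start}, [start]
    while todo:
        for _, t in step(todo.pop()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def bisimilar(step1, s1, step2, s2):
    """Naive check: start from all pairs and discard those that cannot match moves."""
    rel = {(p, q) for p in reachable(step1, s1) for q in reachable(step2, s2)}
    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            forth = all(any(a == b and (p2, q2) in rel for b, q2 in step2(q))
                        for a, p2 in step1(p))
            back = all(any(a == b and (p2, q2) in rel for a, p2 in step1(p))
                       for b, q2 in step2(q))
            if not (forth and back):
                rel.discard((p, q))
                changed = True
    return (s1, s2) in rel

def one_counter():
    # C = iszero.C + inc.dec.C with Sort(C) = {iszero}; "D" names the state dec.C
    return leaf({"C": [("iszero", "C"), ("inc", "D")], "D": [("dec", "C")]}, {"iszero"})

def k_counter(k):
    # State v is the stored value; iszero is only offered when v == 0.
    trans = {}
    for v in range(k + 1):
        moves = [("iszero", v)] if v == 0 else [("dec", v - 1)]
        if v < k:
            moves.append(("inc", v + 1))
        trans[v] = moves
    return leaf(trans, {"iszero"})

def copies(k):
    """Left-nested C || C || ... || C (k copies) and its initial state."""
    system = reduce(par, [one_counter() for _ in range(k)])
    start = reduce(lambda s, c: (s, c), ["C"] * k)
    return system, start

for k in (1, 2, 3, 4):
    (step_par, _), start = copies(k)
    print(k, bisimilar(k_counter(k)[0], 0, step_par, start))
```

A check of this kind is evidence for particular values of k only; the general
statement still needs a proof that exhibits a bisimulation.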

Exercise 14.3 (Solution on page 484)

Prove  that  C2  ~  C  ||  C. 


14.3  Railway Level Crossing

Consider  the  railway  level  crossing  depicted  in  Figure  14.1.  This  system 
consists  of  three  components  working  in  parallel. 

• A Rail process, which represents the arrival of trains, ensuring that
they only cross if the signal is green.

• A Road process, which represents the arrival of cars, ensuring that they
only cross if the barrier is up.

• A Controller process, which regulates the signal and barrier, ensuring
that the barrier is never up at the same time that the signal is green.

This  is  a  typical  example  of  a  control  system,  in  which  a  controller  process  is 
regulating  the  behaviour  of  other  processes  in  order  to  prevent  undesirable 
behaviours.  The  desirable  properties  which  the  controller  would  like  to 
attain  are  of  two  kinds. 

1.  Safety  Properties:  (No  crashes ) 

•  A  car  may  not  cross  at  the  same  time  as  a  train. 

2.  Liveness  Properties:  (Eventual  service ) 

•  If  a  car  arrives,  eventually  the  barrier  goes  up. 

•  If  a  train  arrives,  eventually  the  signal  turns  green. 

We  shall  now  describe  the  behaviour  of  the  three  component  processes. 

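1. $\mathit{Road} \stackrel{\text{def}}{=} \mathit{car}.\mathit{up}.\mathit{ccross}.\mathit{down}.\mathit{Road}$,
with $\mathrm{Sort}(\mathit{Road}) = \{\mathit{up}, \mathit{down}\}$.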

The  Road  process  repeatedly  carries  out  the  following  events. 

(a)  signals  the  arrival  of  a  car  at  the  crossing  (the  car  action); 

(b)  witnesses  the  raising  of  the  barrier  (the  up  action); 

(c)  signals  the  crossing  of  the  car  (the  ccross  action);  and  finally 

(d)  witnesses  the  lowering  of  the  barrier  (the  down  action). 

The  Road  process  is  thus  depicted  by  the  following  transition  system. 


[Figure: transition system of Road, a four-state cycle labelled car, up,
ccross, down.]


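2. $\mathit{Rail} \stackrel{\text{def}}{=} \mathit{train}.\mathit{green}.\mathit{tcross}.\mathit{red}.\mathit{Rail}$,
with $\mathrm{Sort}(\mathit{Rail}) = \{\mathit{green}, \mathit{red}\}$.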

Analogous to the Road process, the Rail process repeatedly carries out
the following events.

(a) signals the arrival of a train at the crossing (the train action);

(b)  witnesses  the  signal  turning  green  (the  green  action); 

(c)  signals  the  crossing  of  the  train  (the  tcross  action);  and  finally 

(d)  witnesses  the  signal  turning  red  (the  red  action). 

The  Rail  process  is  thus  depicted  by  the  following  transition  system. 

[Figure: transition system of Rail, a four-state cycle labelled train, green,
tcross, red.]


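3. $\mathit{Controller} \stackrel{\text{def}}{=} \mathit{green}.\mathit{red}.\mathit{Controller} + \mathit{up}.\mathit{down}.\mathit{Controller}$,
with $\mathrm{Sort}(\mathit{Controller}) = \{\mathit{up}, \mathit{down}, \mathit{green}, \mathit{red}\}$.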

The  Controller  process  is  thus  depicted  by  the  following  transition 
system. 


[Figure: transition system of Controller, with an up/down cycle and a
green/red cycle through the initial state.]


The  complete  railway  level  crossing  system  then  consists  of  the  above 
three  components  executing  in  parallel: 

$$\mathit{Crossing} \stackrel{\text{def}}{=} \mathit{Road} \parallel \mathit{Controller} \parallel \mathit{Rail}$$

Its  structure  can  be  depicted  as  follows. 


[Figure: Road and Controller synchronise on up and down; Controller and Rail
synchronise on green and red.]

Its behaviour is thus given by the following transition system.

Exercise 14.4 (Solution on page 484)

Do  the  desired  safety  and  liveness  properties  mentioned  above  hold?  Explain 
why  or  why  not.  If  any  of  these  properties  fail, 

•  can  you  propose  a  weaker  yet  acceptable  property  which  does  hold? 

•  can  you  propose  a  way  to  alter  the  definitions  of  the  components  of 
the  system  so  that  the  property  does  hold? 
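The state space of Crossing is small enough to enumerate, which gives a
mechanical cross-check of the safety property. The sketch below rests on
stated assumptions rather than on the book's own tooling: the helpers leaf, par
and reachable are hypothetical names, a component may move alone on actions
outside the other component's sort, Sort(E || F) is taken to be
Sort(E) ∪ Sort(F), and the composition is nested as (Road || Controller) || Rail.
"No crashes" is read here as "the barrier is never up while the signal is
green". The liveness properties concern infinite runs and are not settled by
this kind of state inspection.

```python
def leaf(transitions, sort):
    """Sequential process: `transitions` maps a state to (action, next_state) pairs."""
    return (lambda s: transitions.get(s, [])), frozenset(sort)

def par(e, f):
    """E || F, where each process is a pair (step function, synchronisation sort)."""
    (step_e, sort_e), (step_f, sort_f) = e, f
    union = sort_e | sort_f  # assumption: Sort(E || F) = Sort(E) ∪ Sort(F)

    def step(state):
        se, sf = state
        moves_e, moves_f = step_e(se), step_f(sf)
        moves = [(a, (e2, sf)) for a, e2 in moves_e if a not in sort_f]   # E alone
        moves += [(a, (se, f2)) for a, f2 in moves_f if a not in sort_e]  # F alone
        moves += [(a, (e2, f2)) for a, e2 in moves_e for b, f2 in moves_f
                  if a == b and a in union]                               # joint move
        return moves

    return step, union

def reachable(step, start):
    seen, todo = {start}, [start]
    while todo:
        for _, t in step(todo.pop()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

# States 0..3 follow the order of the actions in each definition, so the
# barrier is up in Road states 2 and 3, and the signal is green in Rail
# states 2 and 3.
road = leaf({0: [("car", 1)], 1: [("up", 2)], 2: [("ccross", 3)], 3: [("down", 0)]},
            {"up", "down"})
rail = leaf({0: [("train", 1)], 1: [("green", 2)], 2: [("tcross", 3)], 3: [("red", 0)]},
            {"green", "red"})
controller = leaf({0: [("green", 1), ("up", 2)], 1: [("red", 0)], 2: [("down", 0)]},
                  {"up", "down", "green", "red"})

step, _ = par(par(road, controller), rail)
states = reachable(step, ((0, 0), 0))

safe = all(not (road_s in (2, 3) and rail_s in (2, 3))
           for (road_s, _), rail_s in states)
deadlock_free = all(step(s) for s in states)
print(len(states), safe, deadlock_free)
```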


14.4  Mutual Exclusion

When  two  tasks  are  being  carried  out  together,  problems  can  occur  if  they 
want  to  access  some  shared  resource  at  the  same  time.  A  striking  illustration 
of  this  is  the  Clayton  Tunnel  Accident  (page  2),  wherein  one  train  was 
allowed  to  enter  the  tunnel  which  was  currently  occupied  by  another  train. 
Mutual  exclusion  refers  to  the  problem  of  ensuring  that  two  processes  can 
never  be  in  their  critical  section  -  ie,  using  a  shared  resource  such  as  a 
shared  memory  or  printer  -  at  the  same  time.  If  a  process  is  granted  use  of 
such  a  shared  resource,  it  must  be  allowed  to  maintain  exclusive  use  of  this 
resource  until  it  has  completed  its  use  and  released  it.  This  is  a  ubiquitous 
problem  in  the  design  of  concurrency  systems. 

14.4.1  Dining  Philosophers 

The problem of mutual exclusion was first identified and solved in 1965
by Edsger W. Dijkstra, who also proposed the following illustration of how
a system can deadlock if due care is not taken over the use of shared
resources. It is a very simple problem to consider, yet offers a wealth of
insight into the challenges posed by synchronisation.

Imagine there are five philosophers sitting at a round dining table
thinking. Each philosopher has a plate of spaghetti in front of them, and
there is a fork between every pair of plates. Figure 14.2 depicts the
situation. The spaghetti is hopelessly tangled, meaning that a philosopher
must use two forks together to eat it.† A philosopher may only use the two
forks that are on either side of their own plate, which they pick up one at
a time in either order. After taking a mouthful of spaghetti, the
philosopher then replaces the two forks, in either order, to where they
were lifted from the table. As the philosophers are deep in their own
thoughts, at no point do they communicate with one another.

Our  task  is  to  design  a  protocol  -  that  is,  model  the  interactions  between 
the  philosophers  and  the  forks  -  which  satisfies  the  following  correctness 
properties: 

1. A philosopher eats only when holding two forks.

2. No two philosophers may hold the same fork simultaneously (mutual
exclusion).

3. The philosophers never get stuck, with every one of them forever
waiting for some fork to become available (deadlock freedom).

† The story is often described with rice rather than spaghetti, and
chopsticks in place of forks, making it more immediate that two utensils are
needed to eat.

To  model  this  problem,  we  introduce  the  following  actions: 

• $\mathit{eat}_i$: philosopher i takes a bite to eat.

• $\mathit{lift}_{ij}$: philosopher i picks up fork j.

• $\mathit{drop}_{ij}$: philosopher i puts down fork j.

The  behaviour  of  the  forks  is  easy  to  describe: 

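$$\begin{aligned}
F_1 &\stackrel{\text{def}}{=} \mathit{lift}_{51}.\mathit{drop}_{51}.F_1 + \mathit{lift}_{11}.\mathit{drop}_{11}.F_1 \\
F_2 &\stackrel{\text{def}}{=} \mathit{lift}_{12}.\mathit{drop}_{12}.F_2 + \mathit{lift}_{22}.\mathit{drop}_{22}.F_2 \\
F_3 &\stackrel{\text{def}}{=} \mathit{lift}_{23}.\mathit{drop}_{23}.F_3 + \mathit{lift}_{33}.\mathit{drop}_{33}.F_3 \\
F_4 &\stackrel{\text{def}}{=} \mathit{lift}_{34}.\mathit{drop}_{34}.F_4 + \mathit{lift}_{44}.\mathit{drop}_{44}.F_4 \\
F_5 &\stackrel{\text{def}}{=} \mathit{lift}_{45}.\mathit{drop}_{45}.F_5 + \mathit{lift}_{55}.\mathit{drop}_{55}.F_5
\end{aligned}$$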

That  is  to  say,  a  fork  is  picked  up  by  one  of  the  two  philosophers  nearest  to 
it  at  the  table,  and  is  subsequently  placed  back  on  the  table  by  that  same 
philosopher. 

We  have  some  freedom  in  how  to  define  the  behaviour  of  a  philosopher, 
in  that  it  is  unspecified  which  order  they  pick  up  and  set  down  their  forks. 
As  a  first  attempt,  we  assume  that  they  each  pick  up  the  fork  to  their  right 
first  (as  well  as  set  this  one  down  first): 

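$$\begin{aligned}
P_1 &\stackrel{\text{def}}{=} \mathit{lift}_{11}.\mathit{lift}_{12}.\mathit{eat}_1.\mathit{drop}_{11}.\mathit{drop}_{12}.P_1 \\
P_2 &\stackrel{\text{def}}{=} \mathit{lift}_{22}.\mathit{lift}_{23}.\mathit{eat}_2.\mathit{drop}_{22}.\mathit{drop}_{23}.P_2 \\
P_3 &\stackrel{\text{def}}{=} \mathit{lift}_{33}.\mathit{lift}_{34}.\mathit{eat}_3.\mathit{drop}_{33}.\mathit{drop}_{34}.P_3 \\
P_4 &\stackrel{\text{def}}{=} \mathit{lift}_{44}.\mathit{lift}_{45}.\mathit{eat}_4.\mathit{drop}_{44}.\mathit{drop}_{45}.P_4 \\
P_5 &\stackrel{\text{def}}{=} \mathit{lift}_{55}.\mathit{lift}_{51}.\mathit{eat}_5.\mathit{drop}_{55}.\mathit{drop}_{51}.P_5
\end{aligned}$$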

The  synchronisation  sorts  for  forks  and  philosopher  processes  are  defined 
naturally  as  follows: 

$$\mathrm{Sort}(P_i) = \{\, \mathit{lift}_{ij},\ \mathit{drop}_{ij} : 1 \le j \le 5 \,\}.$$

$$\mathrm{Sort}(F_j) = \{\, \mathit{lift}_{ij},\ \mathit{drop}_{ij} : 1 \le i \le 5 \,\}.$$

Unfortunately  this  protocol  has  the  possibility  of  deadlocking:  every 
philosopher  may  pick  up  a  fork  with  their  right  hand  before  any  one  of 
them  picks  up  the  fork  to  their  left,  at  which  time  they  will  all  be  waiting 
forever  for  their  left-hand  neighbour  to  return  the  fork  to  the  table. 

We can resolve this problem by changing the definition of the first (and
only the first) philosopher, who is required to pick up the fork to their left
first:

$$P_1 \stackrel{\text{def}}{=} \mathit{lift}_{12}.\mathit{lift}_{11}.\mathit{eat}_1.\mathit{drop}_{12}.\mathit{drop}_{11}.P_1$$

With  some  thought,  it  becomes  apparent  that  this  refined  protocol  cannot 
deadlock. 
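Both observations can be cross-checked by searching the reachable states of
each protocol for a state with no outgoing transition. This is a sketch under
stated assumptions, not the book's method: the helper names are hypothetical,
a component may move alone on actions outside the other component's sort,
Sort(E || F) is taken to be Sort(E) ∪ Sort(F), and the ten processes are
composed left-nested in the order P1, ..., P5, F1, ..., F5. Actions are encoded
as tuples such as ("lift", i, j).

```python
from functools import reduce

def leaf(transitions, sort):
    """Sequential process: `transitions` maps a state to (action, next_state) pairs."""
    return (lambda s: transitions.get(s, [])), frozenset(sort)

def par(e, f):
    """E || F, where each process is a pair (step function, synchronisation sort)."""
    (step_e, sort_e), (step_f, sort_f) = e, f
    union = sort_e | sort_f  # assumption: Sort(E || F) = Sort(E) ∪ Sort(F)

    def step(state):
        se, sf = state
        moves_e, moves_f = step_e(se), step_f(sf)
        moves = [(a, (e2, sf)) for a, e2 in moves_e if a not in sort_f]   # E alone
        moves += [(a, (se, f2)) for a, f2 in moves_f if a not in sort_e]  # F alone
        moves += [(a, (e2, f2)) for a, e2 in moves_e for b, f2 in moves_f
                  if a == b and a in union]                               # joint move
        return moves

    return step, union

def reachable(step, start):
    seen, todo = {start}, [start]
    while todo:
        for _, t in step(todo.pop()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def philosopher(i, first, second):
    # lift first fork, lift second fork, eat, drop first, drop second, repeat
    acts = [("lift", i, first), ("lift", i, second), ("eat", i),
            ("drop", i, first), ("drop", i, second)]
    trans = {n: [(a, (n + 1) % 5)] for n, a in enumerate(acts)}
    sort = {(kind, i, j) for kind in ("lift", "drop") for j in range(1, 6)}
    return leaf(trans, sort)

def fork(j):
    # Fork j lies between philosophers j-1 and j (philosophers 5 and 1 for fork 1).
    # State 0 means "on the table"; state i means "held by philosopher i".
    users = [(j - 2) % 5 + 1, j]
    trans = {0: [(("lift", i, j), i) for i in users]}
    for i in users:
        trans[i] = [(("drop", i, j), 0)]
    sort = {(kind, i, j) for kind in ("lift", "drop") for i in range(1, 6)}
    return leaf(trans, sort)

def deadlocks(philosophers):
    procs = philosophers + [fork(j) for j in range(1, 6)]
    step, _ = reduce(par, procs)
    start = reduce(lambda s, c: (s, c), [0] * len(procs))
    return [s for s in reachable(step, start) if not step(s)]

first_attempt = [philosopher(i, i, i % 5 + 1) for i in range(1, 6)]
refined = [philosopher(1, 2, 1)] + first_attempt[1:]

print("first attempt:", len(deadlocks(first_attempt)), "deadlocked state(s)")
print("refined:", len(deadlocks(refined)), "deadlocked state(s)")
```

An exhaustive search of this kind confirms the result for five philosophers
only; it does not replace the argument asked for in the exercise below.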

Exercise 14.5 (Solution on page 484)

Argue  that  the  refined  protocol,  in  which  the  first  philosopher  picks  up  the 
left  fork  first,  cannot  deadlock. 


14.4.2  Peterson’s  Algorithm 

There  have  been  various  solutions  proposed  for  dealing  with  the  mutual 
exclusion  problem.  Here  we  examine  an  elegant  solution  proposed  by  Gary 
L.  Peterson  in  1981. 

We consider two processes, P1 and P2, both of which want at times
to enter some critical section. There are two Boolean variables: b1, which is
true if P1 wants to enter (or is in) the critical section, and b2, which is
true if P2 wants to enter (or is in) the critical section; and a variable k
which has value 1 or 2 indicating which process has “ownership” of (ie, the
authority to enter) the critical section. The Boolean variables b1 and b2 are
initially set to false, while the initial value of k is arbitrary.

The  two  processes  are  then  defined  as  follows  (where  the  actual  details 
of  the  critical  and  noncritical  sections  are  left  unspecified). 


P1: while true do
      ... noncritical section ...
      b1 := true;
      k := 2;
      while (b2 and k=2) do
        skip;
      ... critical section ...
      b1 := false

P2: while true do
      ... noncritical section ...
      b2 := true;
      k := 1;
      while (b1 and k=1) do
        skip;
      ... critical section ...
      b2 := false

When process P1 wishes to enter the critical section, it indicates this by
setting b1 to true, but also sets k to 2, granting authority to the other
process P2 to enter the critical section. It then waits until either the other
process P2 does not wish to enter the critical section (ie, b2 is false) or the
other process grants it authority to enter the critical section (ie, k has
value 1), at which time it enters the critical section; when it exits the
critical section it indicates this by setting b1 to false. Process P2 is
defined in an analogous fashion.
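Before turning to the process model, the pseudocode itself can be explored
exhaustively. The sketch below is an illustration under a simplifying
assumption that the text does not make: each assignment, and each evaluation of
a while condition, is treated as one atomic step. The program counter encoding
(0 to 3) and the function names are hypothetical.

```python
def successors(state):
    """All states reachable in one step of P1 (me = 0) or P2 (me = 1)."""
    pcs, flags, k = state
    for me in (0, 1):
        other = 1 - me
        new_pcs, new_flags, new_k = list(pcs), list(flags), k
        if pcs[me] == 0:                 # b_me := true
            new_flags[me], new_pcs[me] = True, 1
        elif pcs[me] == 1:               # k := index of the other process
            new_k, new_pcs[me] = other + 1, 2
        elif pcs[me] == 2:               # while (b_other and k = other) do skip
            if flags[other] and k == other + 1:
                continue                 # condition holds: keep waiting
            new_pcs[me] = 3              # condition fails: enter critical section
        else:                            # leave critical section: b_me := false
            new_flags[me], new_pcs[me] = False, 0
        yield tuple(new_pcs), tuple(new_flags), new_k

def explore(initial_states):
    seen, todo = set(initial_states), list(initial_states)
    while todo:
        for t in successors(todo.pop()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

# Both flags start false; the initial value of k is arbitrary, so try both.
starts = [((0, 0), (False, False), k) for k in (1, 2)]
states = explore(starts)

mutual_exclusion = all(pcs != (3, 3) for pcs, _, _ in states)
never_stuck = all(any(True for _ in successors(s)) for s in states)
print(len(states), mutual_exclusion, never_stuck)
```

The process model developed next is finer-grained than this sketch, since it
reads b2 and k in separate actions.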

To model these processes as labelled transition systems, we first need to
represent the variables b1, b2 and k themselves as processes which interact
with the processes P1 and P2. The variable b1 is represented by the following
two-state system:


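$$\begin{aligned}
B_{1f} &\stackrel{\text{def}}{=} \mathit{b_1rf}.B_{1f} + \mathit{b_1wf}.B_{1f} + \mathit{b_1wt}.B_{1t} \\
B_{1t} &\stackrel{\text{def}}{=} \mathit{b_1rt}.B_{1t} + \mathit{b_1wt}.B_{1t} + \mathit{b_1wf}.B_{1f} \\
\mathrm{Sort}(B_{1f}) &= \{\, \mathit{b_1rf},\ \mathit{b_1rt},\ \mathit{b_1wf},\ \mathit{b_1wt} \,\}
\end{aligned}$$

[Figure: two states B1f and B1t; loops b1rf and b1wf on B1f, loops b1rt and
b1wt on B1t, a b1wt transition from B1f to B1t and a b1wf transition back.]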

The state B1f (“f” for “false”) signifies that the variable b1 has the value
false, while the state B1t (“t” for “true”) signifies that the variable b1 has
the value true.

• Processes read the value of the variable by synchronising with the
process on the actions b1rf and b1rt (“r” for “read”): the action b1rf
represents the process telling the environment that the value of b1 is
false, while the action b1rt represents the process telling the
environment that the value of b1 is true. The state of the variable process
does not change on these actions.

• Processes write a value to the variable by synchronising with the
process on the actions b1wf and b1wt (“w” for “write”): the action b1wf
represents the value of the variable b1 being set to false, while the
action b1wt represents the value of the variable b1 being set to true. The
state of the variable process changes as appropriate on these actions.

All  of  the  reading  and  writing  actions  are  included  in  the  sort  of  the  process, 
as  clearly  these  actions  must  be  done  in  synchrony  with  this  process. 

The  variable  b2  has  an  analogous  definition: 


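$$\begin{aligned}
B_{2f} &\stackrel{\text{def}}{=} \mathit{b_2rf}.B_{2f} + \mathit{b_2wf}.B_{2f} + \mathit{b_2wt}.B_{2t} \\
B_{2t} &\stackrel{\text{def}}{=} \mathit{b_2rt}.B_{2t} + \mathit{b_2wt}.B_{2t} + \mathit{b_2wf}.B_{2f} \\
\mathrm{Sort}(B_{2f}) &= \{\, \mathit{b_2rf},\ \mathit{b_2rt},\ \mathit{b_2wf},\ \mathit{b_2wt} \,\}
\end{aligned}$$

[Figure: two states B2f and B2t; loops b2rf and b2wf on B2f, loops b2rt and
b2wt on B2t, a b2wt transition from B2f to B2t and a b2wf transition back.]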

Finally,  the  variable  k  is  similarly  defined: 


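$$\begin{aligned}
K_1 &\stackrel{\text{def}}{=} \mathit{kr1}.K_1 + \mathit{kw1}.K_1 + \mathit{kw2}.K_2 \\
K_2 &\stackrel{\text{def}}{=} \mathit{kr2}.K_2 + \mathit{kw2}.K_2 + \mathit{kw1}.K_1 \\
\mathrm{Sort}(K_1) &= \{\, \mathit{kr1},\ \mathit{kr2},\ \mathit{kw1},\ \mathit{kw2} \,\}
\end{aligned}$$

[Figure: two states K1 and K2; loops kr1 and kw1 on K1, loops kr2 and kw2 on
K2, a kw2 transition from K1 to K2 and a kw1 transition back.]

Again, the variable k is represented by a two-state process, representing the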
two  values  that  the  variable  k  can  hold  (either  1  or  2);  and  there  are  actions 
representing  the  reading  and  writing  of  these  two  values. 

We now turn our attention to defining the processes P1 and P2. As the
behaviour of the processes within the noncritical and critical sections is
irrelevant for our study - we are only interested in ensuring that mutual
exclusion is attained - we ignore it completely. The behaviour of the
process P1 thus starts with setting the value of b1 to true and the value of
k to 2:

$$P_1 \stackrel{\text{def}}{=} \mathit{b_1wt}.\mathit{kw2}.W_1$$

The process W1 represents the process at the point of executing the while
loop waiting to enter the critical section:

while (b2 and k=2) do skip

The process will stay in the state W1 for as long as the value of b2 is true
(ie, the action b2rt can occur) and the value of k is 2 (ie, the action kr2 can
occur). However, if either of these is false, that is if the value of b2 is
false (ie, the action b2rf can occur) or the value of k is 1 (ie, the action
kr1 can occur), then the process will move into a new state R1 signifying that
the process is ready to enter the critical section:

$$W_1 \stackrel{\text{def}}{=} \mathit{b_2rt}.W_1 + \mathit{kr2}.W_1 + \mathit{b_2rf}.R_1 + \mathit{kr1}.R_1$$

Finally, in the state R1 the process will enter the critical section, then
(ultimately) exit it, and set the value of the variable b1 to be false, before
then returning to the initial state:

$$R_1 \stackrel{\text{def}}{=} \mathit{enter}.\mathit{exit}.\mathit{b_1wf}.P_1$$

The synchronisation sort of the process P1 contains the three relevant writing
events:

$$\mathrm{Sort}(P_1) = \{\, \mathit{b_1wt},\ \mathit{b_1wf},\ \mathit{kw2} \,\},$$

as  the  variables  can  only  change  value  if  they  are  written  to. 

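The process P2 is defined analogously to process P1:

$$\begin{aligned}
P_2 &\stackrel{\text{def}}{=} \mathit{b_2wt}.\mathit{kw1}.W_2 \\
W_2 &\stackrel{\text{def}}{=} \mathit{b_1rt}.W_2 + \mathit{kr1}.W_2 + \mathit{b_1rf}.R_2 + \mathit{kr2}.R_2 \\
R_2 &\stackrel{\text{def}}{=} \mathit{enter}.\mathit{exit}.\mathit{b_2wf}.P_2 \\
\mathrm{Sort}(P_2) &= \{\, \mathit{b_2wt},\ \mathit{b_2wf},\ \mathit{kw1} \,\}
\end{aligned}$$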

The two processes P1 and P2 are then depicted by the following labelled
transition systems:

The whole system is then the concurrent composition of the processes
P1 and P2 with the variable processes:

$$\mathit{Peterson} \stackrel{\text{def}}{=} P_1 \parallel P_2 \parallel B_{1f} \parallel B_{2f} \parallel K_1.$$
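The composed system is small enough to explore state by state. The sketch
below encodes the definitions above directly; it is an illustration under
stated assumptions, with hypothetical helper names (leaf, par, reachable,
variable, process) and hypothetical names "A", "C" and "X" for the
intermediate states the definitions leave anonymous. A component is assumed to
move alone on actions outside the other component's sort, Sort(E || F) is
taken to be Sort(E) ∪ Sort(F), and the composition is nested to the left. A
process counts as being in its critical section between its enter and exit
actions.

```python
def leaf(transitions, sort):
    """Sequential process: `transitions` maps a state to (action, next_state) pairs."""
    return (lambda s: transitions.get(s, [])), frozenset(sort)

def par(e, f):
    """E || F, where each process is a pair (step function, synchronisation sort)."""
    (step_e, sort_e), (step_f, sort_f) = e, f
    union = sort_e | sort_f  # assumption: Sort(E || F) = Sort(E) ∪ Sort(F)

    def step(state):
        se, sf = state
        moves_e, moves_f = step_e(se), step_f(sf)
        moves = [(a, (e2, sf)) for a, e2 in moves_e if a not in sort_f]   # E alone
        moves += [(a, (se, f2)) for a, f2 in moves_f if a not in sort_e]  # F alone
        moves += [(a, (e2, f2)) for a, e2 in moves_e for b, f2 in moves_f
                  if a == b and a in union]                               # joint move
        return moves

    return step, union

def reachable(step, start):
    seen, todo = {start}, [start]
    while todo:
        for _, t in step(todo.pop()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def variable(name, values):
    # The state is the current value: reading it keeps the state, writing w moves to w.
    trans = {v: [(f"{name}r{v}", v)] + [(f"{name}w{w}", w) for w in values]
             for v in values}
    sort = {f"{name}{op}{v}" for op in "rw" for v in values}
    return leaf(trans, sort)

def process(me, other):
    # P = b_me wt . kw other . W ;  W as in the text ;  R = enter . exit . b_me wf . P
    trans = {
        "P": [(f"b{me}wt", "A")],
        "A": [(f"kw{other}", "W")],
        "W": [(f"b{other}rt", "W"), (f"kr{other}", "W"),
              (f"b{other}rf", "R"), (f"kr{me}", "R")],
        "R": [("enter", "C")],
        "C": [("exit", "X")],          # "C": inside the critical section
        "X": [(f"b{me}wf", "P")],
    }
    return leaf(trans, {f"b{me}wt", f"b{me}wf", f"kw{other}"})

p1, p2 = process(1, 2), process(2, 1)
b1, b2, k = variable("b1", "ft"), variable("b2", "ft"), variable("k", "12")

step, _ = par(par(par(par(p1, p2), b1), b2), k)
states = reachable(step, (((("P", "P"), "f"), "f"), "1"))

mutual_exclusion = all(not (s1 == "C" and s2 == "C")
                       for ((((s1, s2), _), _), _) in states)
print(len(states), mutual_exclusion)
```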


Exercise 14.6 (Solution on page 485)


Argue that the two processes P1 and P2 can never both be in the critical
section at the same time.


14.5  A Message Delivery System

We  now  wish  to  specify  a  simple  message  delivery  system,  which  models 
the  sending  of  a  message  by  a  Sender  process  to  a  Receiver  process.  The 
Sender  and  Receiver  are  not  directly  connected;  rather,  the  message  is 
routed  through  some  Medium.  For  example,  the  Sender  and  Receiver 
may  be  two  devices  on  a  local  area  network  connected  by  an  Ethernet; 
or  they  may  be  computers  on  opposite  sides  of  the  globe  connected  by  a 
complex  mesh  of  links  between  routers.  In  our  simple  system,  we  ignore 
the  actual  content  of  the  message  being  sent,  as  well  as  the  address  of  the 
Receiver,  as  there  will  only  be  a  single  Receiver  process. 

When  the  Sender  accepts  a  message  to  send  to  the  Receiver  (modelled 
by  an  “in”  action),  it  sends  the  message  to  the  Medium  (modelled  by  a 
“snd”  action),  and  awaits  an  acknowledgement  that  the  message  has  been 
successfully  delivered  to  the  Receiver  (modelled  by  an  “ack”  action);  when 
an  acknowledgement  is  received,  the  Sender  will  be  ready  to  accept  the  next 
message  to  send.  It  may  be  the  case  that  the  message  is  lost  or  corrupted 
by  the  Medium,  in  which  case  the  Medium  signals  to  the  Sender  that 
a  fault  has  occurred  (modelled  by  an  “err”  action);  this  typically  occurs 
in  practice  through  a  time-out  mechanism.  The  Sender  responds  to  this 
fault  by  re-transmitting  the  message  to  the  Medium.  The  behaviour  of  the 
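Sender is thus modelled by the process Sender defined as follows:

$$\begin{aligned}
\mathit{Sender} &\stackrel{\text{def}}{=} \mathit{in}.\mathit{snd}.S \\
S &\stackrel{\text{def}}{=} \mathit{ack}.\mathit{Sender} + \mathit{err}.\mathit{snd}.S \\
\mathrm{Sort}(\mathit{Sender}) &= \{\mathit{snd}, \mathit{ack}, \mathit{err}\}
\end{aligned}$$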

Its  transition  system  is  depicted  thus: 


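[Figure: transition system of Sender: in then snd lead to S; from S, ack
returns to Sender, while err followed by snd returns to S.]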


The Medium accepts a message from the Sender (via the snd action),
and  either  delivers  it  to  the  Receiver  (modelled  by  a  rev  action)  and 
returns  to  its  original  state  to  await  the  next  message  to  be  sent,  or  it 
signals  to  the  Sender  that  a  fault  has  occurred  (via  the  err  action)  and 
again  returns  to  its  original  state  to  await  the  retransmission  of  the  previous 
message.  The  behaviour  of  the  Medium  is  thus  modelled  by  the  process 
Medium  defined  as  follows: 

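$$\mathit{Medium} \stackrel{\text{def}}{=} \mathit{snd}.(\mathit{rev}.\mathit{Medium} + \mathit{err}.\mathit{Medium})$$

$$\mathrm{Sort}(\mathit{Medium}) = \{\mathit{snd}, \mathit{rev}, \mathit{err}\}$$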

Its  transition  system  is  depicted  thus: 


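[Figure: transition system of Medium: snd leads to a second state, from which
rev and err both return to Medium.]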


Finally, the Receiver awaits the delivery of a message (via the rev
action), and outputs the message (modelled by an out action) before issuing
an acknowledgement (via the ack action) that the message has been
successfully received and delivered. Its behaviour is modelled by the process
Receiver defined as follows:

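$$\mathit{Receiver} \stackrel{\text{def}}{=} \mathit{rev}.\mathit{out}.\mathit{ack}.\mathit{Receiver}$$

$$\mathrm{Sort}(\mathit{Receiver}) = \{\mathit{rev}, \mathit{ack}\}$$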

The transition system is depicted thus:

[Figure: transition system of Receiver, a three-state cycle labelled rev, out,
ack.]


The  complete  system  is  defined  to  be  the  composition  of  these  three 
components: 

$$\mathit{System} \stackrel{\text{def}}{=} \mathit{Sender} \parallel \mathit{Medium} \parallel \mathit{Receiver}$$

and has the following configuration:


Note that in this simple model, the acknowledgement is communicated
directly from the Receiver to the Sender, which is unrealistic given the
purpose of the System to relay messages between them. In reality, the
acknowledgement would be routed through the Medium, resulting in a second
phase which is identical to the first but with the roles of the Sender and
Receiver reversed.

The behaviour of the complete system is thus given by the transition
system shown in Figure 14.3.
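The transitions of the complete system can also be generated from the three
definitions. The following sketch lists every reachable transition; it is an
illustration under stated assumptions rather than a reproduction of the
figure. The helper names are hypothetical, as are the names given to
intermediate states ("ready", "resend", "full", "got", "done"); a component is
assumed to move alone on actions outside the other component's sort,
Sort(E || F) is taken to be Sort(E) ∪ Sort(F), and the composition is nested as
(Sender || Medium) || Receiver. Action names are copied as they are printed in
the text.

```python
def leaf(transitions, sort):
    """Sequential process: `transitions` maps a state to (action, next_state) pairs."""
    return (lambda s: transitions.get(s, [])), frozenset(sort)

def par(e, f):
    """E || F, where each process is a pair (step function, synchronisation sort)."""
    (step_e, sort_e), (step_f, sort_f) = e, f
    union = sort_e | sort_f  # assumption: Sort(E || F) = Sort(E) ∪ Sort(F)

    def step(state):
        se, sf = state
        moves_e, moves_f = step_e(se), step_f(sf)
        moves = [(a, (e2, sf)) for a, e2 in moves_e if a not in sort_f]   # E alone
        moves += [(a, (se, f2)) for a, f2 in moves_f if a not in sort_e]  # F alone
        moves += [(a, (e2, f2)) for a, e2 in moves_e for b, f2 in moves_f
                  if a == b and a in union]                               # joint move
        return moves

    return step, union

def explore(step, start):
    """Return the reachable states and the list of reachable transitions."""
    seen, todo, edges = {start}, [start], []
    while todo:
        s = todo.pop()
        for a, t in step(s):
            edges.append((s, a, t))
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen, edges

sender = leaf({"Sender": [("in", "ready")],
               "ready": [("snd", "S")],
               "S": [("ack", "Sender"), ("err", "resend")],
               "resend": [("snd", "S")]},
              {"snd", "ack", "err"})
medium = leaf({"Medium": [("snd", "full")],
               "full": [("rev", "Medium"), ("err", "Medium")]},
              {"snd", "rev", "err"})
receiver = leaf({"Receiver": [("rev", "got")],
                 "got": [("out", "done")],
                 "done": [("ack", "Receiver")]},
                {"rev", "ack"})

step, _ = par(par(sender, medium), receiver)
states, edges = explore(step, (("Sender", "Medium"), "Receiver"))

for source, action, target in sorted(edges, key=str):
    print(source, f"--{action}-->", target)
```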


Exercise 14.7 (Solution on page 485)

Enhance  the  simple  message  passing  system  so  that  acknowledgements  are 
routed  through  the  Medium  from  the  Receiver  to  the  Sender.  Don’t 
neglect  the  possibility  of  acknowledgements  being  lost. 


14.6  Alternating Bit Protocol


The message passing system in the previous section is an example of a very
important concept in communication networks: that of communication
protocols.

Figure 14.3: The message-passing system.


When  you  click  on  a  link  in  your  favourite  browser  to  a  Web  site  on  the 
opposite  side  of  the  globe,  or  send  an  email  to  your  friend  who  is  perhaps 
thousands  of  miles  away,  a  complicated  procedure  is  carried  out  between 
dozens  of  computers  in  relaying  your  message  to  its  destination  (either  the 
computer  hosting  the  Web  site  you  are  wanting  to  access,  or  the  computer 
on  which  your  friend  reads  email).  Your  message  gets  relayed,  bit-by-bit, 
through  a  long  chain  of  intermediate  computers  as  it  gets  routed  towards  its 
destination.  At  any  point  in  this  chain,  a  bit  of  your  message  can  get  lost  in 
transmission,  and  the  particular  computer  which  sent  the  bit  that  got  lost 
needs  to  know  that  the  bit  was  lost  so  that  it  can  retransmit  it. 

Of course, one computer cannot tell another computer that it didn’t get
a message, as it wouldn’t know that it was supposed to get one; and in fact
the most common cause of a message being lost in transmission is
a receiving computer being broken, and thus unable to receive the message
or send an acknowledgement. Thus, when messages are passed from one
computer to another, the sending computer will wait for an acknowledgement
from the receiving computer; if this doesn’t come within a reasonable
amount of time, the sending computer will assume that the message got lost
and retransmit it. Of course, it may be the acknowledgement that got lost:
the receiving computer may receive a message and send an acknowledgement
and subsequently receive the same message again. In this case, the
receiver will assume that its acknowledgement was lost, leading the sender
to retransmit the message, in which case the receiver will retransmit the
acknowledgement.

There are very many different communication protocols implemented on
computers carrying out the above task. In this section we consider a common
simple protocol: the alternating bit protocol. This protocol again involves
a Sender and a Receiver communicating through a Medium, and works
as follows.

• The Sender accepts a message to be sent to the Receiver (modelled
by an “in” action), and sends it into the Medium tagged with
a protocol bit 0 or 1 (modelled by the actions “s0” and “s1”,
respectively). It then awaits an acknowledgement from the Medium tagged
by the same protocol bit (modelled by the actions “ack0” and “ack1”,
respectively).

When the Sender receives an acknowledgement tagged by the correct
protocol bit, it flips the protocol bit and repeats the protocol for the
next message.

If the Sender receives an acknowledgement tagged by the wrong protocol
bit, or if it times out waiting for the acknowledgement to arrive
(modelled by a “t” action), it retransmits the message (again with the
corresponding bit attached).

The behaviour of the Sender is thus defined by the following process:

$$\begin{aligned}
\mathit{Sender} &\stackrel{\text{def}}{=} S_0 \qquad \mathrm{Sort}(\mathit{Sender}) = \{ s_0, s_1 \} \\
S_0 &\stackrel{\text{def}}{=} \mathit{in}.S_0' \\
S_0' &\stackrel{\text{def}}{=} s_0.(\mathit{ack}_0.S_1 + \mathit{ack}_1.S_0' + t.S_0') \\
S_1 &\stackrel{\text{def}}{=} \mathit{in}.S_1' \\
S_1' &\stackrel{\text{def}}{=} s_1.(\mathit{ack}_1.S_0 + \mathit{ack}_0.S_1' + t.S_1')
\end{aligned}$$

The synchronisation sort of the Sender process contains the actions s0
and s1, as the only way the Medium could receive a message is through
a communication with the Sender; it can only do these actions if and
when the Sender does them.

• When the Receiver receives a message from the Medium tagged by
the expected protocol bit (modelled by the actions “r0” and “r1”,
respectively), it outputs the message (modelled by an “out” action) and
sends an acknowledgement into the Medium tagged by that protocol
bit (modelled by the actions “rack0” and “rack1”, respectively).

The  Receiver  then  awaits  a  new  message  tagged  by  the  opposite 
protocol  bit,  with  which  it  will  repeat  this  protocol.  In  the  meantime, 
it  will  acknowledge  any  further  messages  tagged  by  the  old  bit. 

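The behaviour of the Receiver is thus defined by the following process:

$$\begin{aligned}
\mathit{Receiver} &\stackrel{\text{def}}{=} R_0 \qquad \mathrm{Sort}(\mathit{Receiver}) = \{ \mathit{rack}_0, \mathit{rack}_1 \} \\
R_0 &\stackrel{\text{def}}{=} r_0.\mathit{out}.\mathit{rack}_0.R_1 + r_1.\mathit{rack}_1.R_0 \\
R_1 &\stackrel{\text{def}}{=} r_1.\mathit{out}.\mathit{rack}_1.R_0 + r_0.\mathit{rack}_0.R_1
\end{aligned}$$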


The synchronisation sort of the Receiver process contains the actions
rack0 and rack1, as the only way the Medium could receive an
acknowledgement is through a communication with the Receiver; it
can only do these actions if and when the Receiver does them.

• The Medium merely passes messages from the Sender to the Receiver
and acknowledgements from the Receiver to the Sender.
Its behaviour is defined by the following process:

$$\begin{aligned}
\mathit{Medium} &\stackrel{\text{def}}{=} M \qquad \mathrm{Sort}(\mathit{Medium}) = \{ r_0, r_1, \mathit{ack}_0, \mathit{ack}_1 \} \\
M &\stackrel{\text{def}}{=} s_0.r_0.M + s_1.r_1.M + \mathit{rack}_0.\mathit{ack}_0.M + \mathit{rack}_1.\mathit{ack}_1.M
\end{aligned}$$

The synchronisation sort of the Medium process does not contain the
actions s0 and s1; the Sender may send a message without it being
received by the Medium. Nor does it contain the actions rack0 and
rack1; the Receiver may send an acknowledgement without it being
received by the Medium. However, it does contain the actions r0 and
r1, as the Receiver can only receive a message from the Medium; as
well as the actions ack0 and ack1, as the Sender can only receive an
acknowledgement from the Medium.

The complete system is defined to be the composition of these three
components

$$\mathit{System} \stackrel{\text{def}}{=} \mathit{Sender} \parallel \mathit{Medium} \parallel \mathit{Receiver}$$

and has the following configuration:


Its  complete  transition  system  is  large,  but  by  considering  it  carefully,  it  can 
be  verified  that  the  in  and  out  actions  occur  in  an  alternating  fashion  and, 
equally  important,  that  the  protocol  can  never  deadlock. 
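Rather than inspecting the transition system by hand, both claims can be
cross-checked by exhaustive exploration. The sketch below is an illustration
under stated assumptions: the helper names and the names of the intermediate
states are hypothetical, a component is assumed to move alone on actions
outside the other component's sort, Sort(E || F) is taken to be
Sort(E) ∪ Sort(F), and the composition is nested as
(Sender || Medium) || Receiver. Alternation is checked by running a two-state
monitor alongside the system that records whether in or out is expected next.

```python
def leaf(transitions, sort):
    """Sequential process: `transitions` maps a state to (action, next_state) pairs."""
    return (lambda s: transitions.get(s, [])), frozenset(sort)

def par(e, f):
    """E || F, where each process is a pair (step function, synchronisation sort)."""
    (step_e, sort_e), (step_f, sort_f) = e, f
    union = sort_e | sort_f  # assumption: Sort(E || F) = Sort(E) ∪ Sort(F)

    def step(state):
        se, sf = state
        moves_e, moves_f = step_e(se), step_f(sf)
        moves = [(a, (e2, sf)) for a, e2 in moves_e if a not in sort_f]   # E alone
        moves += [(a, (se, f2)) for a, f2 in moves_f if a not in sort_e]  # F alone
        moves += [(a, (e2, f2)) for a, e2 in moves_e for b, f2 in moves_f
                  if a == b and a in union]                               # joint move
        return moves

    return step, union

def sender():
    trans = {}
    for b, o in ((0, 1), (1, 0)):
        trans[f"S{b}"] = [("in", f"S{b}'")]
        trans[f"S{b}'"] = [(f"s{b}", f"W{b}")]          # W: awaiting acknowledgement
        trans[f"W{b}"] = [(f"ack{b}", f"S{o}"), (f"ack{o}", f"S{b}'"), ("t", f"S{b}'")]
    return leaf(trans, {"s0", "s1"})

def receiver():
    trans = {}
    for b, o in ((0, 1), (1, 0)):
        trans[f"R{b}"] = [(f"r{b}", f"output{b}"), (f"r{o}", f"old{b}")]
        trans[f"output{b}"] = [("out", f"acking{b}")]
        trans[f"acking{b}"] = [(f"rack{b}", f"R{o}")]
        trans[f"old{b}"] = [(f"rack{o}", f"R{b}")]      # re-acknowledge an old message
    return leaf(trans, {"rack0", "rack1"})

def medium():
    trans = {"M": []}
    for b in (0, 1):
        trans["M"] += [(f"s{b}", f"msg{b}"), (f"rack{b}", f"reply{b}")]
        trans[f"msg{b}"] = [(f"r{b}", "M")]
        trans[f"reply{b}"] = [(f"ack{b}", "M")]
    return leaf(trans, {"r0", "r1", "ack0", "ack1"})

def check(step, start):
    """Explore (state, expected) pairs; return (alternates, deadlock_free)."""
    seen, todo = {(start, "in")}, [(start, "in")]
    alternates, deadlock_free = True, True
    while todo:
        s, expected = todo.pop()
        moves = step(s)
        deadlock_free = deadlock_free and bool(moves)
        for a, t in moves:
            nxt = expected
            if a in ("in", "out"):
                if a != expected:
                    alternates = False
                nxt = "out" if a == "in" else "in"
            if (t, nxt) not in seen:
                seen.add((t, nxt))
                todo.append((t, nxt))
    return alternates, deadlock_free

step, _ = par(par(sender(), medium()), receiver())
alternates, deadlock_free = check(step, (("S0", "M"), "R0"))
print(alternates, deadlock_free)
```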


Exercise 14.8 (Solution on page 488)

Argue that the in and out actions occur in alternating fashion in the
alternating bit protocol, and that the protocol can never deadlock.


14.7  Additional Exercises


1. (a) Give a definition for a 3-counter C3, and draw its labelled
transition system.

(b) Draw the labelled transition system for C || C || C, where C is
the 1-counter given in Section 14.2.

(c)  Prove  that  C3  ~  C  ||  C  ||  C. 

2.  Normally,  a  barrier  at  a  railway  level  crossing  remains  up  until  a  train 
arrives;  this  signals  the  controller,  which  then  lowers  the  barrier,  then 
turns  the  signal  green,  then  turns  the  signal  red  again,  and  finally 
raises  the  barrier  once  again.  Such  a  controller  C  is  thus  represented 
by  the  following  LTS: 


(a) Give a definition for C. This includes defining its sort, Sort(C).

(b)  Give  the  definitions  and  associated  LTS  for  the  new  Road  and 
Rail  systems  Ro  and  Ra,  respectively,  which  correspond  to  the 
new  Controller  C.  (Keep  in  mind  that  the  new  Road  system  starts 
in  a  state  where  the  barrier  is  up;  and  the  new  Rail  system  must 
signal  the  controller  when  a  train  arrives,  using  the  new  event 
“signal”  common  to  their  sorts.) 

(c)  Now  consider  the  liveness  properties  again. 

i.  Is  it  now  the  case  that,  if  the  barrier  is  down  when  a  car 
arrives,  then  the  barrier  will  eventually  go  up? 

ii.  Is  it  now  the  case  that,  if  the  signal  is  red  when  a  train  arrives, 
then  the  signal  will  eventually  turn  green? 


3.  Argue  that  the  process  for  Peterson’s  Algorithm  can  never  deadlock. 


4. Model Dekker’s Algorithm for mutual exclusion, outlined as follows.

P1: while true do
      ... noncritical section ...
      b1 := true;
      while b2 do
        if k=2 then
          b1 := false
          while k=2 do skip
          b1 := true
      ... critical section ...
      k := 2;
      b1 := false

P2: while true do
      ... noncritical section ...
      b2 := true;
      while b1 do
        if k=1 then
          b2 := false
          while k=1 do skip
          b2 := true
      ... critical section ...
      k := 1;
      b2 := false


5.  Argue  that  the  complete  alternating  bit  protocol  system  can  never 
deadlock,  and  that  the  in  and  out  actions  alternate  as  desired. 

6. Show that the operator || is commutative, by showing that the
transition systems defined by E || F and F || E are isomorphic (ie,
identical, disregarding the - irrelevant - labels of the states).

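7. Show that the operator || is not associative, by showing that
$(E \parallel F) \parallel F \not\sim E \parallel (F \parallel F)$, where $E \stackrel{\text{def}}{=} a.0$ with $\mathrm{Sort}(E) = \{a\}$
and $F \stackrel{\text{def}}{=} a.b.0$ with $\mathrm{Sort}(F) = \{b\}$.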

What  restriction  on  synchronisation  sorts  would  make  this  operator 
associative? 

8. Consider a new parallel operator $E \mid F$ defined by the following
transition rules:

(a) If $E \xrightarrow{a} E'$ and $a \notin \mathrm{Sort}(F)$ then $E \mid F \xrightarrow{a} E' \mid F$.

(b) If $F \xrightarrow{a} F'$ and $a \notin \mathrm{Sort}(E)$ then $E \mid F \xrightarrow{a} E \mid F'$.

(c) If $E \xrightarrow{a} E'$ and $F \xrightarrow{a} F'$ and $a \in \mathrm{Sort}(E) \cap \mathrm{Sort}(F)$ then
$E \mid F \xrightarrow{a} E' \mid F'$.

This is identical to the synchronisation merge except that the transition
rule for synchronising processes requires the action on which the
processes synchronise to be in the sorts of both processes rather than
just one of them.

Show that the operator $\mid$ is both commutative and associative.


Chapter  15 


*  Temporal  Properties 


When  I  eventually  met  Mr  Right  I  had  no  idea  that  his  first  name 
was  Always. 

-  Rita  Rudner. 

The modal logic HML of Chapter 13, while faithfully characterising
properties which are relevant for distinguishing between process behaviours,
has a fundamental drawback: a given formula $P \in \mathrm{HML}$ can only explore the
initial behaviour of a process, namely its first k steps where $k = \mathrm{md}(P)$ is
the modal depth of the formula. We cannot write a single formula that will
explore a process to an unbounded length of its execution.

Consider, for example, the property of being deadlockable. In various
examples of real-world system verifications, a common question is whether
or not the system in question might at some point in time deadlock, that
is, reach a state from which no action is possible. For example, we might
like to verify that a new operating system design can never get into a
deadlocked state, one in which the machine on which it is running simply
“hangs”, leaving the user to apply the age-old solution of turning it off and
on again.

Such properties are referred to as temporal properties, as they refer
to the long-term behaviour of a system throughout the lifetime of its
execution. Note that, since these properties are still based on the behaviour
of systems, any such property which is true of a given process will be true
of any equivalent process. These properties typically fall under one of the
following two categories:

•  Safety  properties  assert  that  nothing  bad  ever  happens.  Typical 
examples  of  safety  properties  include:  the  operating  system  will  never 
deadlock;  or  a  car  will  never  be  able  to  enter  a  level  crossing  at  the 
same  time  as  a  train. 

• Liveness properties assert that something good eventually happens.
Typical examples of liveness properties include: having pressed
the elevator button the elevator will eventually arrive; or if a train
arrives its signal will eventually turn green.

In this chapter we will explore various standard temporal properties,
as well as the means to define our own temporal properties from recursive
equations involving operators of the modal logic.


(15.1) Three Standard Temporal Operators

A variety of fundamental temporal operators have been devised for expressing
properties. Some of these are described as follows.

15.1.1 Always: $\Box P$

The most basic safety property, that nothing bad ever happens, is catered
for by the temporal operator $\Box P$ (pronounced “box P”) which asserts that
the property $P$ is true in every state into which the process may evolve.
Formally,

    $E \models \Box P$ if, and only if,

    $F \models P$ for all $F$ such that $E \xrightarrow{w} F$ for some $w \in A^*$.

This is similar to the action “box” operator $[a]P$ except that the transitions
involve arbitrary strings $w \in A^*$ rather than a single action $a \in A$.

We can view this property as an infinite conjunction; the property asserts
that $P$ is true after any number of transitions:

    $P \land [-]P \land [-][-]P \land [-][-][-]P \land \cdots$

That is to say, $P$ is true at the start; and after any single transition; and
after any two transitions; and after any three transitions; and so on. Another
way to view this is as a recursive property: the above infinite conjunction
expresses a property $X$ which satisfies the recursive equation

    $X = P \land [-]X$

which describes a property expressing the fact that $P$ is true, and no matter
what transition happens the property defined by $X$ must hold (that is, $P$
is true, and no matter what transition happens the property defined by $X$
must hold (that is, ...)). Note that every one of an infinite number of
properties must be true in order to satisfy $\Box P$.

Deadlock-freedom is an example of a property which can be defined with
this operator: being free of deadlocks means that the property of being able
to perform some action, ie $\langle - \rangle\mathit{true}$, is true in every state into which the
process may evolve:

    $\textit{Deadlock-free} = \Box\langle - \rangle\mathit{true}$.
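
On a finite-state system the definition of "always" can be checked directly, by collecting every state reachable by some string of actions and testing the property in each. The following is a minimal sketch under that finiteness assumption; the transition system `LTS` and the helper names `reachable`, `always`, `can_act` and `deadlock_free` are hypothetical illustrations, not taken from the text.

```python
from collections import deque

# Hypothetical finite labelled transition system (not from the text):
# each state maps to its list of (action, successor) transitions.
LTS = {
    "s0": [("tick", "s1")],
    "s1": [("tick", "s0"), ("stop", "s2")],
    "s2": [],                 # no transitions at all: a deadlocked state
    "t0": [("tick", "t0")],   # a separate clock that ticks forever
}

def reachable(lts, start):
    """All states F with start --w--> F for some string w (w may be empty)."""
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for _action, target in lts[state]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def always(lts, start, prop):
    """start |= box P: prop holds in every state reachable from start."""
    return all(prop(state) for state in reachable(lts, start))

def can_act(lts, state):
    """The property <->true: some action is possible in this state."""
    return len(lts[state]) > 0

def deadlock_free(lts, start):
    """Deadlock-free = box <->true."""
    return always(lts, start, lambda state: can_act(lts, state))

print(deadlock_free(LTS, "s0"))  # False: s2 is reachable and cannot act
print(deadlock_free(LTS, "t0"))  # True: t0 only ever reaches itself
```

The breadth-first search terminates only because the state space is finite; for infinite-state processes this direct check is not available.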
(Example 15.1)

Consider the two clock processes $\mathit{Clock}$ and $\mathit{Clock}^*$ from Example 11.11
pictured in Figure 11.7 (page 298). Recall that these two processes could not
be distinguished by any formula of HML, since they were $n$-game equivalent
for every $n \in \mathbb{N}$, despite the fact that they are not bisimilar (that is, they
are not $\infty$-game equivalent). What distinguishes $\mathit{Clock}^*$ from $\mathit{Clock}$ is the
possibility that it may evolve into a deadlock-free state:

    $\mathit{Clock}^* \models \langle tick \rangle\Box\langle tick \rangle\mathit{true}$ but $\mathit{Clock} \not\models \langle tick \rangle\Box\langle tick \rangle\mathit{true}$.


(Example 15.2)

The safety property for our railway level crossing example of Section 14.3 is
that at no time can a car cross at the same time as a train. This is expressed
as

    $\Box(\,[ccross]\mathit{false} \lor [tcross]\mathit{false}\,)$.

That is to say, it is always the case that either I cannot do a ccross action
(a car cannot cross) or I cannot do a tcross action (a train cannot cross).


15.1.2 Possibly: $\Diamond P$

If we wish to express the possibility that something bad may happen,
we can use another standard temporal operator, $\Diamond P$ (pronounced
“diamond P”), which asserts that the property $P$ is true in some state into
which the process may evolve. Formally,

    $E \models \Diamond P$ if, and only if,

    $F \models P$ for some $F$ such that $E \xrightarrow{w} F$ for some $w \in A^*$.

This is similar to the action “diamond” operator $\langle a \rangle P$ except that the
transitions involve arbitrary strings $w \in A^*$ rather than a single action $a \in A$.

We can view this property as an infinite disjunction; the property asserts
that $P$ is true after some number of transitions:

    $P \lor \langle - \rangle P \lor \langle - \rangle\langle - \rangle P \lor \langle - \rangle\langle - \rangle\langle - \rangle P \lor \cdots$

That is to say, $P$ is true: either at the start; or after some single transition;
or after some two transitions; or after some three transitions; and so on. Another
way to view this is as a recursive property: the above infinite disjunction
expresses a property $X$ which satisfies the recursive equation

    $X = P \lor \langle - \rangle X$

which describes a property expressing the fact that either $P$ is true, or there
is some transition which may happen after which the property defined by
$X$ will hold (that is, either $P$ is true, or there is some transition which may
happen after which the property defined by $X$ will hold (that is, ...)). Note
that some one of an infinite number of properties must be true in order to
satisfy $\Diamond P$.

The property of being deadlockable, ie the opposite of Deadlock-freedom,
is an example of a property which can be defined with this operator: being
deadlockable means that the property of not being able to perform any
action, ie $[-]\mathit{false}$, is true in some state into which the process may evolve:

    $\textit{Deadlockable} = \Diamond[-]\mathit{false}$.

The properties $\Diamond P$ and $\Box P$ are related in the same way that $\langle a \rangle P$ and
$[a]P$ are related, in that each is used to express the negation of the other:

    $\neg\Diamond P = \Box\neg P$ and $\neg\Box P = \Diamond\neg P$

These operations are thus inter-definable:

• $\Diamond P = \neg\Box\neg P$: $P$ is true in some reachable state if, and only if,
it is not true that $P$ is false in every reachable state;

• $\Box P = \neg\Diamond\neg P$: $P$ is true in every reachable state if, and only if,
it is not true that $P$ is false in some reachable state.
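
The duality between the two operators can be observed on a small finite system by evaluating both over the same set of reachable states. This is only an illustrative sketch: the transition system `LTS` and the helpers `reachable`, `always`, `possibly`, `stuck` and `not_stuck` are hypothetical names introduced here, and the check covers one example rather than proving the general law.

```python
# Hypothetical finite transition system: state -> list of (action, successor).
LTS = {
    "s0": [("a", "s1"), ("b", "s2")],
    "s1": [("a", "s1")],
    "s2": [],
}

def reachable(lts, start):
    """States reachable from start by any (possibly empty) string of actions."""
    seen, stack = {start}, [start]
    while stack:
        for _action, target in lts[stack.pop()]:
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def always(lts, start, prop):
    """box P: prop holds in every reachable state."""
    return all(prop(s) for s in reachable(lts, start))

def possibly(lts, start, prop):
    """diamond P: prop holds in some reachable state."""
    return any(prop(s) for s in reachable(lts, start))

def stuck(state):
    """[-]false: no action is possible."""
    return not LTS[state]

def not_stuck(state):
    """<->true: some action is possible (the negation of stuck)."""
    return bool(LTS[state])

for state in LTS:
    deadlockable = possibly(LTS, state, stuck)
    deadlock_free = always(LTS, state, not_stuck)
    # diamond P agrees with not box not P on every state of this example.
    assert deadlockable == (not deadlock_free)
    print(state, "deadlockable:", deadlockable)
```

Here `s0` and `s2` are deadlockable while `s1`, which can only loop on itself, is not.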

(Exercise 15.2) (Solution on page 488)

Use the above relationships between $\Box$ and $\Diamond$ to show that

    $\textit{Deadlock-free} = \neg\textit{Deadlockable}$

where $\textit{Deadlock-free} = \Box\langle - \rangle\mathit{true}$ and $\textit{Deadlockable} = \Diamond[-]\mathit{false}$.


15.1.3 Until: $P \mathbin{\mathsf{U}} Q$

It is often desirable to express that some property remains true until some
other property becomes true, and that this latter property eventually does
at some time become true. For example, we might wish to assert that when
we send a document to a printer, the document will remain on the printer
queue until it is scheduled to be printed, and it will eventually be printed.
This type of property is expressed by the temporal operator $P \mathbin{\mathsf{U}} Q$ which
asserts two things: that a particular property $Q$ will eventually be true; and
that until that time the property $P$ will remain true. Formally:

    $E \models P \mathbin{\mathsf{U}} Q$ if, and only if,

        if $E = E_0 \to E_1 \to E_2 \to \cdots \to E_n \not\to$

        or $E = E_0 \to E_1 \to E_2 \to E_3 \to \cdots$

        then $\exists k$ such that $E_k \models Q$ and $E_i \models P$ for all $i < k$.

Note that $P \mathbin{\mathsf{U}} Q$ is true if $Q$ initially holds; and that $P$ can remain true
when $Q$ eventually holds but doesn’t have to.

We can view the property $P \mathbin{\mathsf{U}} Q$ as a property $X$ which satisfies the
recursive equation

    $X = Q \lor (P \land \langle - \rangle\mathit{true} \land [-]X)$

which describes a property expressing the fact that: either $Q$ is true; or $P$
is true, and it is possible to do something, and no matter what you do the
property defined by $X$ must hold (that is: either $Q$ is true; or $P$ is true,
and it is possible to do something, and no matter what you do the property
defined by $X$ must hold (that is: ...)). Again note that some one of an
infinite number of properties must be true in order to satisfy $P \mathbin{\mathsf{U}} Q$.
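
For a finite-state system, the recursive equation for "until" can be evaluated by repeatedly applying its right-hand side, starting from the empty set, until nothing changes. The sketch below assumes that this smallest solution is the intended one (the choice between solutions is the subject of Section 15.4); the transition system `LTS`, the sets `P` and `Q`, and the function name `until` are hypothetical.

```python
# Hypothetical finite transition system: state -> list of (action, successor).
LTS = {
    0: [("t", 1)],
    1: [("t", 2)],
    2: [],
    3: [("t", 3)],            # loops forever without ever reaching Q
    4: [("t", 2), ("t", 3)],  # may reach Q, but may also fall into the loop
}
P = {0, 1, 3, 4}   # states in which P holds (assumed given)
Q = {2}            # states in which Q holds (assumed given)

def until(lts, p_states, q_states):
    """Iterate X = Q or (P and <->true and [-]X) upwards from the empty set."""
    current = set()
    while True:
        nxt = {
            e for e in lts
            if e in q_states
            or (e in p_states
                and len(lts[e]) > 0                         # <->true
                and all(t in current for _a, t in lts[e]))  # [-]X
        }
        if nxt == current:
            return current
        current = nxt

print(sorted(until(LTS, P, Q)))  # [0, 1, 2]
```

State 3 is excluded because its only run never reaches a `Q` state, and state 4 is excluded because one of its runs enters that loop.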

(Exercise 15.3) (Solution on page 488)

The generic liveness property asserts that something good eventually happens.
Show how to express the temporal operator $\mathsf{Ev}\,P$ (pronounced “eventually P”)
using the above standard temporal operators.


(15.2) Recursive Properties

The temporal operators considered in the previous section could all be
viewed as solutions to recursive equations over the language HML of modal
logic. For example, we noted above that $\Box P$ expresses a property $X$ which
satisfies the recursive equation

    $X = P \land [-]X$.

Thus we would want $E \models \Box P$ to hold if, and only if, the following is true:

    $E \models X$ if, and only if, $E \models P \land [-]X$.

To incorporate this idea into the logic HML, we need to introduce variables
such as $X$ into the language of properties. However, the semantic definition
from Section 13.2 gives us no means by which we can determine if $E \models X$.
It is not enough to assume that each variable $X$ is defined by some equation
$X = P$ and declare that $E \models X$ if, and only if, $E \models P$. For example, if
$E \stackrel{\text{def}}{=} a.E$ and $X$ is defined by $X = \langle a \rangle X$, we would only be able to infer
that $E \models X$ if, and only if, $E \models X$; either answer - $E \models X$ or $E \not\models X$ - is
consistent with this observation.

To get around this deficiency, we need to introduce some mechanism to
determine which states satisfy a variable property like $X$. This is provided
by a valuation function

    $V : \mathit{Variables} \to \mathcal{P}(\mathit{States})$

where $\mathit{Variables}$ is a set of variables (such as $X$ above), and $\mathit{States}$ is the set
of states of the labelled transition system in which we are interested. Modal
formulae involving variables are then interpreted with respect to a valuation
function as follows:

    $E \models_V \mathit{true}$       for all $E$.
    $E \models_V \mathit{false}$      for no $E$.
    $E \models_V X$          if, and only if, $E \in V(X)$.
    $E \models_V \neg P$     if, and only if, $E \not\models_V P$.
    $E \models_V P \land Q$  if, and only if, $E \models_V P$ and $E \models_V Q$.
    $E \models_V P \lor Q$   if, and only if, $E \models_V P$ or $E \models_V Q$.
    $E \models_V \langle a \rangle P$  if, and only if, $F \models_V P$ for some $F$ such that $E \xrightarrow{a} F$.
    $E \models_V [a]P$       if, and only if, $F \models_V P$ for all $F$ such that $E \xrightarrow{a} F$.

This is identical to the original definition for $E \models P$ but for the extra clause
which determines when a state $E$ satisfies a variable property $X$; this case
is catered for by the valuation function $V$ which is now attached to the
satisfaction relation $\models_V$.

In a similar fashion we can extend the global semantic definition $\|P\|$ to
incorporate the valuation function as follows.

    $\|\mathit{true}\|_V = \mathit{States}$
    $\|\mathit{false}\|_V = \emptyset$
    $\|X\|_V = V(X)$
    $\|\neg P\|_V = \mathit{States} \setminus \|P\|_V$
    $\|P \land Q\|_V = \|P\|_V \cap \|Q\|_V$
    $\|P \lor Q\|_V = \|P\|_V \cup \|Q\|_V$
    $\|\langle a \rangle P\|_V = \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ for some } E' \in \|P\|_V \,\}$
    $\|[a]P\|_V = \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ implies } E' \in \|P\|_V \,\}$
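
The global semantics above translates almost clause for clause into a recursive function over formulae, at least when the set of states is finite. The following sketch makes several assumptions that are not part of the text: formulae are encoded as nested tuples such as `("dia", "a", ("var", "X"))`, a valuation is a dictionary from variable names to sets of states, and the transition system `LTS` and function name `sem` are hypothetical.

```python
# Hypothetical finite transition system: state -> list of (action, successor).
# "E" loops on a forever, "F" can do a single a, "Nil" can do nothing.
LTS = {"E": [("a", "E")], "F": [("a", "Nil")], "Nil": []}

def sem(formula, lts, val):
    """Return ||formula||_val as a set of states, one case per semantic clause."""
    states = set(lts)
    tag = formula[0]
    if tag == "true":
        return states
    if tag == "false":
        return set()
    if tag == "var":
        return set(val[formula[1]])
    if tag == "not":
        return states - sem(formula[1], lts, val)
    if tag == "and":
        return sem(formula[1], lts, val) & sem(formula[2], lts, val)
    if tag == "or":
        return sem(formula[1], lts, val) | sem(formula[2], lts, val)
    if tag == "dia":   # <a>P: some a-successor satisfies P
        action, body = formula[1], sem(formula[2], lts, val)
        return {e for e in states
                if any(b == action and t in body for b, t in lts[e])}
    if tag == "box":   # [a]P: every a-successor satisfies P
        action, body = formula[1], sem(formula[2], lts, val)
        return {e for e in states
                if all(t in body for b, t in lts[e] if b == action)}
    raise ValueError(f"unknown formula tag: {tag!r}")

# <a>X under two different valuations of X:
print(sem(("dia", "a", ("var", "X")), LTS, {"X": {"E"}}))    # {'E'}
print(sem(("dia", "a", ("var", "X")), LTS, {"X": {"Nil"}}))  # {'F'}
```

The two calls show that the meaning of a formula containing a variable depends entirely on the valuation supplied for that variable.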


Theorem  13.12  then  extends  directly  to  properties  with  variables  as  follows. 


(Theorem 15.3)

$E \models_V P$ if, and only if, $E \in \|P\|_V$.


(Exercise 15.4) (Solution on page 489)


Prove Theorem 15.3.

15.2.1 Solving Recursive Equations


In order to determine if a state $E$ satisfies the property being expressed by
a recursive equation $X = P$, where $P$ is an HML formula possibly involving
the variable $X$, we need to somehow “solve” the equation $X = P$. This
equation simply declares that the set of states which satisfy the property $X$
being defined is precisely the set of states which satisfy the property $P$; in
other words, we need to equate the following two sets:

    $\|X\|_V = \|P\|_V$

To solve this equation we need to find a valuation $V$ which makes this a
valid set equation. Since $\|X\|_V = V(X)$, the answer we seek is the set $S$
which such a valuation $V$ assigns to the variable $X$. That is, the solution is
a set $S \subseteq \mathit{States}$ satisfying

    $S = \|P\|_{V[X \mapsto S]}$

where $V[X \mapsto S]$ denotes the valuation $V$ updated by assigning the set $S$ to
the variable $X$:

    $(V[X \mapsto S])(Y) = \begin{cases} S & \text{if } Y = X \\ V(Y) & \text{if } Y \neq X. \end{cases}$

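(Example 15.4)

Consider a property $X$ which satisfies the equation

    $X = \langle a \rangle X$.

Informally, this equation suggests that an infinite sequence of consecutive $a$
actions can be performed:

    $E \models X \iff E \xrightarrow{a} E'$ for some $E'$ such that $E' \models X$

    $\iff E \xrightarrow{a} E' \xrightarrow{a} E''$ for some $E'$ and $E''$ such that $E'' \models X$

    $\iff \cdots$

    $\iff E \xrightarrow{a} E' \xrightarrow{a} E'' \xrightarrow{a} E''' \xrightarrow{a} \cdots$ for some $E', E'', E''', \ldots$

Let $S \subseteq \mathit{States}$ be the set of such states:

    $S = \{\, E \in \mathit{States} : E \xrightarrow{a} \cdot \xrightarrow{a} \cdot \xrightarrow{a} \cdots \,\}$.

Then

    $\|\langle a \rangle X\|_{V[X \mapsto S]} = \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ for some } E' \in S \,\} = S$.

Thus, as intended, $S$ satisfies the equation $S = \|\langle a \rangle X\|_{V[X \mapsto S]}$.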

One problem that we have is that an arbitrary recursive equation needn’t
necessarily have a solution. For example, if we take the recursive equation

    $X = \neg X$

then given any valuation $V$,

    $\|X\|_V = V(X) \neq \mathit{States} \setminus V(X) = \mathit{States} \setminus \|X\|_V = \|\neg X\|_V$.

Another problem is that an equation may be satisfied by many different
solutions, as illustrated in the following.

(Exercise 15.5) (Solution on page 489)

Show that the set $S = \emptyset$ satisfies the equation $S = \|\langle a \rangle X\|_{V[X \mapsto S]}$ from
Example 15.4.


However, we will show here that any recursive equation which does not
involve negation has a solution, and moreover we will show how to solve it
to obtain the intended solution.
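
Both difficulties noted above (an equation with no solution, and an equation with more than one) can be observed by brute force on a tiny system, simply by trying every candidate set of states. This is a sketch for illustration only; the three-state transition system `LTS` and the names `powerset`, `dia_a`, `complement` and `solutions` are hypothetical, and exhaustive enumeration is feasible only for very small state spaces.

```python
from itertools import combinations

# Hypothetical system: "E" loops on a, "F" does one a and stops, "Nil" is stuck.
LTS = {"E": [("a", "E")], "F": [("a", "Nil")], "Nil": []}
STATES = frozenset(LTS)

def powerset(states):
    """Every subset of states, smallest first."""
    items = sorted(states)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def dia_a(s):
    """||<a>X|| under the valuation mapping X to s."""
    return frozenset(e for e in STATES
                     if any(act == "a" and t in s for act, t in LTS[e]))

def complement(s):
    """||not X|| under the valuation mapping X to s."""
    return STATES - s

def solutions(f):
    """All sets S with S = f(S)."""
    return [s for s in powerset(STATES) if f(s) == s]

print([sorted(s) for s in solutions(dia_a)])       # [[], ['E']]
print([sorted(s) for s in solutions(complement)])  # []
```

On this system the equation built from `dia_a` has two solutions, while the one built from `complement` has none.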

15.2.2 Fixed Point Solutions

Let $f : \mathcal{P}(\mathit{States}) \to \mathcal{P}(\mathit{States})$ be defined by

    $f(S) = \|P\|_{V[X \mapsto S]}$.

Then a solution to the recursive equation $X = P$ is merely a fixed point of
this function: a set $S \subseteq \mathit{States}$ such that $S = f(S)$. By the Knaster-Tarski
Theorem (Theorem 6.18, page 174), this function is guaranteed to have a
fixed point - in fact both greatest and least fixed points - so long as the
function is monotonic. That this function is monotonic is an immediate
corollary of the following result.

(Theorem 15.5)

Let $P$ be an HML formula which does not involve negation, and let $V$ and
$W$ be valuations such that $V(X) \subseteq W(X)$ for all $X$. Then $\|P\|_V \subseteq \|P\|_W$.


Proof: By induction - and arguing by cases - on the structure of $P$.

$P = \mathit{true}$: $\|\mathit{true}\|_V = \mathit{States} = \|\mathit{true}\|_W$.

$P = \mathit{false}$: $\|\mathit{false}\|_V = \emptyset = \|\mathit{false}\|_W$.

$P = X$: $\|X\|_V = V(X) \subseteq W(X) = \|X\|_W$.

$P = Q_1 \land Q_2$: $\|Q_1 \land Q_2\|_V = \|Q_1\|_V \cap \|Q_2\|_V$
    $\subseteq \|Q_1\|_W \cap \|Q_2\|_W$
    $= \|Q_1 \land Q_2\|_W$

$P = Q_1 \lor Q_2$: $\|Q_1 \lor Q_2\|_V = \|Q_1\|_V \cup \|Q_2\|_V$
    $\subseteq \|Q_1\|_W \cup \|Q_2\|_W$
    $= \|Q_1 \lor Q_2\|_W$

$P = \langle a \rangle Q$: $\|\langle a \rangle Q\|_V = \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ such that } E' \in \|Q\|_V \,\}$
    $\subseteq \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ such that } E' \in \|Q\|_W \,\}$
    $= \|\langle a \rangle Q\|_W$

$P = [a]Q$: $\|[a]Q\|_V = \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ implies } E' \in \|Q\|_V \,\}$
    $\subseteq \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ implies } E' \in \|Q\|_W \,\}$
    $= \|[a]Q\|_W$  □

The Knaster-Tarski Theorem thus tells us that recursive equations which
do not involve negation have two identifiable solutions, corresponding to
their greatest and least fixed point solutions. This raises the question, when
considering a recursively-defined property, as to which solution - if indeed
either of them - represents the intended solution. That is, if we express a
property as a recursive equation $X = P$, the set of states which satisfy the
property we have in mind is a fixed point of the function $f(S) = \|P\|_{V[X \mapsto S]}$,
but is it the greatest fixed point, or the least fixed point, or some fixed point
in between?

This  question  will  be  explored  in  Section  15.4,  where  we  will  show  that 
the  answer  is  roughly: 

•  least  fixed  points  express  liveness  properties;  and 

•  greatest  fixed  points  express  safety  properties. 

Before we do this, though, we first look more carefully at adding these two
fixed points to the modal logic HML without negation. The resulting logic
with fixed points is called the modal mu-calculus, and is one of the most
fundamental logics used in the specification of computer systems.


(Exercise 15.6) (Solution on page 490)

What are the least and greatest fixed points of the function

    $f(S) = \|\langle a \rangle X\|_{V[X \mapsto S]}$

corresponding to the property considered in Example 15.4?
Can you find a fixed point which is neither least nor greatest?
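    $\|\mathit{true}\|_V = \mathit{States}$

    $\|\mathit{false}\|_V = \emptyset$

    $\|P \land Q\|_V = \|P\|_V \cap \|Q\|_V$

    $\|P \lor Q\|_V = \|P\|_V \cup \|Q\|_V$

    $\|\langle a \rangle P\|_V = \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ for some } E' \in \|P\|_V \,\}$

    $\|[a]P\|_V = \{\, E \in \mathit{States} : E \xrightarrow{a} E' \text{ implies } E' \in \|P\|_V \,\}$

    $\|X\|_V = V(X)$

    $\|\mu X.P\|_V = \bigcap \{\, S \subseteq \mathit{States} : S \supseteq \|P\|_{V[X \mapsto S]} \,\}$

    $\|\nu X.P\|_V = \bigcup \{\, S \subseteq \mathit{States} : S \subseteq \|P\|_{V[X \mapsto S]} \,\}$


Figure 15.1: Global semantics of the modal mu-calculus.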


(15.3)  The  Modal  Mu-Calculus 

The syntax of the modal mu-calculus consists of the logic HML - minus
negation - extended with variables and constructs for defining both greatest
and least fixed points. Formally, it is presented by the following BNF
equation:

    $P, Q ::= \mathit{true} \mid \mathit{false} \mid P \land Q \mid P \lor Q \mid \langle a \rangle P \mid [a]P \mid X \mid \mu X.P \mid \nu X.P$

The symbols $\mu$ and $\nu$ are the characters “mu” and “nu” from the Greek
alphabet (from which the name “mu-calculus” derives). The formula $\mu X.P$ is
used to represent the least fixed point of the equation $X = P$ (or more
correctly, of the function $f(S) = \|P\|_{V[X \mapsto S]}$) whereas $\nu X.P$ is used to represent
its greatest fixed point.

An inductive definition of the semantic function $\|P\|_V$ - defining which
states satisfy the property $P$ with respect to the valuation $V$ - is given
in Figure 15.1. The clauses are identical to those presented for the basic
modal logic HML in Figure 13.2 with the inclusion of the obvious clause
for variables, and clauses for the fixed points as given by the Knaster-Tarski
Theorem (Theorem 6.18, page 174). That the Knaster-Tarski Theorem applies
follows from the fact that the function $f(S) = \|P\|_{V[X \mapsto S]}$ is monotonic.
We demonstrated this for HML in Theorem 15.5, but we need to extend this
result for the larger logic.
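
The two fixed-point clauses of Figure 15.1 can be evaluated literally on a small finite system: intersect every set that contains its own image under $f$, and unite every set that is contained in its own image. The sketch below does exactly that by exhaustive enumeration. The transition system `LTS`, the particular equation (here $X = \langle b \rangle\mathit{true} \lor \langle a \rangle X$, chosen only for illustration) and all function names are assumptions made for this example.

```python
from functools import reduce
from itertools import combinations

# Hypothetical system: s0 -a-> s1 -b-> s2, and s3 loops on a forever.
LTS = {"s0": [("a", "s1")], "s1": [("b", "s2")], "s2": [], "s3": [("a", "s3")]}
STATES = frozenset(LTS)

def f(s):
    """||<b>true or <a>X|| under the valuation mapping X to s."""
    return frozenset(
        e for e in STATES
        if any(act == "b" for act, _t in LTS[e])
        or any(act == "a" and t in s for act, t in LTS[e])
    )

def powerset(states):
    items = sorted(states)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def least_fixed_point(func):
    """Intersection of all S with func(S) a subset of S (the mu clause)."""
    pre_fixed = [s for s in powerset(STATES) if func(s) <= s]
    return reduce(frozenset.intersection, pre_fixed, STATES)

def greatest_fixed_point(func):
    """Union of all S with S a subset of func(S) (the nu clause)."""
    post_fixed = [s for s in powerset(STATES) if s <= func(s)]
    return reduce(frozenset.union, post_fixed, frozenset())

print(sorted(least_fixed_point(f)))     # ['s0', 's1']
print(sorted(greatest_fixed_point(f)))  # ['s0', 's1', 's3']
```

The two answers differ only on `s3`: the least fixed point demands that a `b` action is actually reached, while the greatest fixed point also admits the state that can postpone it forever.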
(Theorem 15. T)

Let $P$ be a modal mu-calculus formula which does not involve negation,
and let $V$ and $W$ be valuations such that $V(X) \subseteq W(X)$ for all $X$. Then
$\|P\|_V \subseteq \|P\|_W$.


Proof: By induction - and arguing by cases - on the structure of $P$. All
of the cases have been catered for in the proof of Theorem 15.5 - and carry
over directly to the present setting - apart from the cases of variables and
fixed point formulae, which we handle here.

$P = X$: $\|X\|_V = V(X) \subseteq W(X) = \|X\|_W$

$P = \mu X.Q$: $E \in \|\mu X.Q\|_V$
    $\iff E \in S$ whenever $\|Q\|_{V[X \mapsto S]} \subseteq S$
    $\implies E \in S$ whenever $\|Q\|_{W[X \mapsto S]} \subseteq S$
    $\iff E \in \|\mu X.Q\|_W$

$P = \nu X.Q$: $E \in \|\nu X.Q\|_V$
    $\iff E \in S$ for some $S$ where $S \subseteq \|Q\|_{V[X \mapsto S]}$
    $\implies E \in S$ for some $S$ where $S \subseteq \|Q\|_{W[X \mapsto S]}$
    $\iff E \in \|\nu X.Q\|_W$  □


(Definition 15.6)

A direct definition of when a state $E$ satisfies a property $P$ of the modal
mu-calculus with respect to a valuation $V$ for interpreting free variables which
appear in $P$ is as follows:

    $E \models_V \mathit{true}$      for all $E$.
    $E \models_V \mathit{false}$     for no $E$.
    $E \models_V P \land Q$ if, and only if, $E \models_V P$ and $E \models_V Q$.
    $E \models_V P \lor Q$  if, and only if, $E \models_V P$ or $E \models_V Q$.
    $E \models_V \langle a \rangle P$ if, and only if, $F \models_V P$ for some $F$ such that $E \xrightarrow{a} F$.
    $E \models_V [a]P$      if, and only if, $F \models_V P$ for all $F$ such that $E \xrightarrow{a} F$.
    $E \models_V X$         if, and only if, $E \in V(X)$.
    $E \models_V \mu X.P$   if, and only if, $\forall S \subseteq \mathit{States}$: if $E \notin S$ then
                            $\exists F \notin S$ such that $F \models_{V[X \mapsto S]} P$.
    $E \models_V \nu X.P$   if, and only if, $\exists S \subseteq \mathit{States}$: $E \in S$ and
                            $\forall F \in S$: $F \models_{V[X \mapsto S]} P$.

We leave it as an exercise (Exercise 5, page 402) to prove (by induction on
the structure of $P$) that $E \models_V P$ if, and only if, $E \in \|P\|_V$.

Leaving  negation  out  of  the  logic  is  not  a  real  restriction,  as  the  result 
from  Section  13.3  that  negation  is  definable  in  the  modal  logic  HML  extends 
to  the  whole  of  the  modal  mu-calculus.  This  is  justified  by  the  following. 

(Exercise 15.7) (Solution on page 490)

The negation of a modal mu-calculus formula can be inductively defined as
follows:

    $neg(\mathit{true}) = \mathit{false}$                  $neg(\langle a \rangle P) = [a]\,neg(P)$
    $neg(\mathit{false}) = \mathit{true}$                  $neg([a]P) = \langle a \rangle\,neg(P)$
    $neg(P \land Q) = neg(P) \lor neg(Q)$     $neg(\mu X.P) = \nu X.neg(P)$
    $neg(P \lor Q) = neg(P) \land neg(Q)$     $neg(\nu X.P) = \mu X.neg(P)$
    $neg(X) = X$

Prove that $E \models_{\overline{V}} neg(P)$ if, and only if, $E \not\models_V P$, where $\overline{V}(X) = \mathit{States} \setminus V(X)$.
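
The inductive definition of negation above is a straightforward recursion on the syntax, as the following sketch shows. The encoding of formulae as nested tuples (with tags such as `"mu"`, `"nu"`, `"dia"` and `"box"`, and `"-"` standing for "any action") is an assumption made for this illustration and is not notation from the text; the code only transcribes the defining clauses and does not prove the property asked for in the exercise.

```python
def neg(p):
    """Push negation through a mu-calculus formula, one clause per constructor."""
    tag = p[0]
    if tag == "true":
        return ("false",)
    if tag == "false":
        return ("true",)
    if tag == "and":
        return ("or", neg(p[1]), neg(p[2]))
    if tag == "or":
        return ("and", neg(p[1]), neg(p[2]))
    if tag == "dia":                      # neg(<a>P) = [a]neg(P)
        return ("box", p[1], neg(p[2]))
    if tag == "box":                      # neg([a]P) = <a>neg(P)
        return ("dia", p[1], neg(p[2]))
    if tag == "mu":                       # neg(mu X.P) = nu X.neg(P)
        return ("nu", p[1], neg(p[2]))
    if tag == "nu":                       # neg(nu X.P) = mu X.neg(P)
        return ("mu", p[1], neg(p[2]))
    if tag == "var":                      # neg(X) = X; the valuation is complemented instead
        return p
    raise ValueError(f"unknown formula tag: {tag!r}")

# Hypothetical encoding of mu X. [-]false or <->X:
deadlockable = ("mu", "X", ("or", ("box", "-", ("false",)),
                                  ("dia", "-", ("var", "X"))))
print(neg(deadlockable))
# ('nu', 'X', ('and', ('dia', '-', ('true',)), ('box', '-', ('var', 'X'))))
```

Applying `neg` twice returns the original formula, since each clause is undone by its dual.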


(15.4) Least versus Greatest Fixed Points

We  now  understand  how  to  interpret  recursive  logical  properties  as  fixed 
points  of  particular  functions  between  sets  of  states.  However,  we  are  left 
with  the  problem  of  understanding  why  the  property  we  intend  is  expressed 
by  either  the  greatest  or  the  least  fixed  point  of  this  function,  as  well  as 
the  problem  of  knowing  which.  To  solve  this,  we  shall  explore  how  such  a 
recursive  property  can  be  understood  by  “unrolling”  it. 

Given a property $X$ defined by a recursive equation $X = P$, we can
unroll the equation by replacing each occurrence of $X$ on the right-hand
side by $P$ itself. Clearly this will not change the meaning of the property
being defined by the recursive equation.


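(Example 15.7)

Suppose we wish to express the property that an infinite sequence of
consecutive $a$ actions can happen. That is, denoting this property by $X$, we
would like the following to be the case:

    $E \models X$ if, and only if, $E \xrightarrow{a} \cdot \xrightarrow{a} \cdot \xrightarrow{a} \cdots$

This property is expressed by the recursive equation

    $X = \langle a \rangle X$

which we can repeatedly unroll as follows:

    $X = \langle a \rangle X$
    $\;\; = \langle a \rangle\langle a \rangle X$
    $\;\; = \langle a \rangle\langle a \rangle\langle a \rangle X$
    $\;\; = \langle a \rangle\langle a \rangle\langle a \rangle\langle a \rangle X$
    $\;\; = \langle a \rangle\langle a \rangle\langle a \rangle\langle a \rangle\langle a \rangle \cdots$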


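Suppose we wish to express the property that an $a$ action must eventually
occur. This property is expressed by the recursive equation

    $X = \langle - \rangle\mathit{true} \land [-a]X$.

That is: some action is possible; and if anything other than an $a$ action
occurs, then an $a$ action must eventually occur in the resulting state. We
can repeatedly unroll this recursive equation as follows:

    $X = \langle - \rangle\mathit{true} \land [-a]X$

    $\;\; = \langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a]X)$

    $\;\; = \langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a]X))$

    $\;\; = \langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a](\cdots)))$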

15.4.1  Approximating  Fixed  Points 

By repeatedly unrolling a recursive equation, we seem to eliminate the
variable from the formula. Of course we would have to unroll the equation
infinitely often in order to get rid of the variable altogether. However, we don’t
have any means for determining whether or not an infinitely-long property
is satisfied. We can, however, define better and better approximations for
such properties, by replacing the variable in the unrolled formula by either
false or true. To this end, we can define the nth mu- and nu-approximants
as follows.


Given a recursive equation $X = P$, the nth mu-approximant $\mu^n X.P$ and
the nth nu-approximant $\nu^n X.P$ are defined inductively as follows:

    $\mu^0 X.P = \mathit{false}$                          $\nu^0 X.P = \mathit{true}$

    $\mu^{n+1} X.P = P[X \mapsto \mu^n X.P]$        $\nu^{n+1} X.P = P[X \mapsto \nu^n X.P]$

These definitions extend to all ordinal numbers (see Section 12.5.1), with the
following definitions for the approximants corresponding to a limit ordinal
$\lambda$:

    $\mu^\lambda X.P = \bigvee_{\alpha < \lambda} \mu^\alpha X.P$        $\nu^\lambda X.P = \bigwedge_{\alpha < \lambda} \nu^\alpha X.P$
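
For finite $n$, the approximants can be generated mechanically by substituting the previous approximant for the variable. The sketch below covers only finite approximants of fixed-point-free bodies, and relies on assumptions of its own: formulae are encoded as nested tuples, and the helper names `substitute`, `mu_approx`, `nu_approx` and `show` are hypothetical.

```python
def substitute(p, var, q):
    """Return p with every occurrence of the variable var replaced by q."""
    tag = p[0]
    if tag in ("true", "false"):
        return p
    if tag == "var":
        return q if p[1] == var else p
    if tag in ("and", "or"):
        return (tag, substitute(p[1], var, q), substitute(p[2], var, q))
    if tag in ("dia", "box"):
        return (tag, p[1], substitute(p[2], var, q))
    raise ValueError(f"unknown formula tag: {tag!r}")

def mu_approx(n, var, body):
    """n-th mu-approximant: start from false and unroll n times."""
    approx = ("false",)
    for _ in range(n):
        approx = substitute(body, var, approx)
    return approx

def nu_approx(n, var, body):
    """n-th nu-approximant: start from true and unroll n times."""
    approx = ("true",)
    for _ in range(n):
        approx = substitute(body, var, approx)
    return approx

def show(p):
    """Render a formula as text, e.g. <a><a>false."""
    tag = p[0]
    if tag in ("true", "false"):
        return tag
    if tag == "var":
        return p[1]
    if tag == "and":
        return f"({show(p[1])} & {show(p[2])})"
    if tag == "or":
        return f"({show(p[1])} | {show(p[2])})"
    if tag == "dia":
        return f"<{p[1]}>{show(p[2])}"
    return f"[{p[1]}]{show(p[2])}"   # remaining case: box

body = ("dia", "a", ("var", "X"))    # the equation X = <a>X
print(show(mu_approx(3, "X", body)))  # <a><a><a>false
print(show(nu_approx(3, "X", body)))  # <a><a><a>true
```

Each further unrolling adds one more layer of the body around the previous approximant, which is exactly the inductive clause of the definition.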


(^Example  15. T) 

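Recall the property from Example 15.7 that an infinite sequence of
consecutive $a$ actions can happen:

    $X = \langle a \rangle X$.

Its mu-approximants $\Phi_n$ and nu-approximants $\Psi_n$ are as follows:

    $\Phi_0 = \mathit{false}$
    $\Phi_1 = \langle a \rangle\mathit{false}$
    $\Phi_2 = \langle a \rangle\langle a \rangle\mathit{false}$
    $\Phi_3 = \langle a \rangle\langle a \rangle\langle a \rangle\mathit{false}$

    $\Psi_0 = \mathit{true}$
    $\Psi_1 = \langle a \rangle\mathit{true}$
    $\Psi_2 = \langle a \rangle\langle a \rangle\mathit{true}$
    $\Psi_3 = \langle a \rangle\langle a \rangle\langle a \rangle\mathit{true}$

Clearly none of the mu-approximants $\Phi_n$ can be satisfied by any state.
However, every one of the nu-approximants $\Psi_n$ must be satisfied in order for our
intended property to be satisfied.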

This  is  suggestive  of  a  safety  property:  checking  that  something  bad 
never  happens  (in  this  case,  that  an  a  action  is  impossible)  amounts  to 
checking  the  validity  of  every  unrolling  of  the  formula,  starting  from  true. 


(^Example  15.1(T) 

Recall the property from Example 15.7 that an $a$ action must eventually
occur:

    $X = \langle - \rangle\mathit{true} \land [-a]X$.

Its mu-approximants $\Phi_n$ and nu-approximants $\Psi_n$ are as follows:

    $\Phi_0 = \mathit{false}$
    $\Phi_1 = \langle - \rangle\mathit{true} \land [-a]\mathit{false}$
    $\Phi_2 = \langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a]\mathit{false})$
    $\Phi_3 = \langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a]\mathit{false}))$

    $\Psi_0 = \mathit{true}$
    $\Psi_1 = \langle - \rangle\mathit{true} \land [-a]\mathit{true}$
    $\Psi_2 = \langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a]\mathit{true})$
    $\Psi_3 = \langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a](\langle - \rangle\mathit{true} \land [-a]\mathit{true}))$

With a little thought, it is apparent that one of the mu-approximants $\Phi_n$
must be satisfied in order for our intended property to be satisfied. However,
every one of the nu-approximants $\Psi_n$ is satisfied, for example, by a process which
runs forever without ever doing an $a$ action.

This is indicative of a liveness property: checking that something good
eventually happens (in this case, that an $a$ action occurs) amounts to
checking the validity of some unrolling of the formula, starting from false.

In  the  first  of  the  above  two  examples,  the  property  which  we  wished 
to  express  was  interpreted  as  the  conjunction  of  all  of  the  nu-approximants 
(the  unrollings  starting  from  true);  while  in  the  second  of  the  two  examples, 
the  property  of  interest  was  interpreted  as  the  disjunction  of  all  of  the  mu- 
approximants  (the  unrollings  starting  from  false).  In  what  follows,  we  shall 
see  that  the  first  corresponds  to  the  greatest  fixed  point  interpretation  of 
the  recursive  property,  while  the  second  corresponds  to  the  least  fixed  point 
interpretation. 

In Section 6.5 we described how the least and greatest fixed points of
a monotonic function $f$ defined on the powerset of a given set $S$ could be
constructed, by repeatedly applying the function $f$ to either the empty set $\emptyset$
(for the least fixed point) or to the whole set $S$ (for the greatest fixed point);
this result was given in Theorem 6.19. This is just the result we are looking
for here, as the nth mu- and nu-approximants correspond, respectively, to
applying the relevant function $n$ times either to the empty set $\emptyset$ or to the
whole set $\mathit{States}$. These facts are recorded in the following.
(Theorem 15.1cP)

$f^n(\emptyset) = \|\mu^n X.P\|_V$ and $f^n(\mathit{States}) = \|\nu^n X.P\|_V$, where $f(S) = \|P\|_{V[X \mapsto S]}$.


Proof: We prove only the first result, by induction on $n$, and leave the
proof of the second as an exercise (Exercise 6, page 402).

For the base case $n = 0$, we have

    $f^0(\emptyset) = \emptyset = \|\mathit{false}\|_V = \|\mu^0 X.P\|_V$

For the induction step, we have that

    $f^{n+1}(\emptyset) = f(f^n(\emptyset))$

    $= f(\|\mu^n X.P\|_V)$  (by induction)

    $= \|P\|_{V[X \mapsto \|\mu^n X.P\|_V]}$

    $= \|P[X \mapsto \mu^n X.P]\|_V$

    $= \|\mu^{n+1} X.P\|_V$  □


(^Example  15. lT) _ 

Consider  the  recursive  equation  for  DP,  the  property  that  P  holds  in  every 
reachable  state: 

X  =  P  A  [~]X. 

This  gives  rise  to  the  function 

/(S)  =  {E  6  |]P||V  :  E  ->  E1  implies  E1  eS}. 

Using the construction from Theorem 6.19 (page 175) starting from the
empty set $\emptyset$, we discover that

$f(\emptyset) = \emptyset$

demonstrating that the least fixed point is the empty set. This certainly does
not correspond to the property $\Box P$. However, starting from the universal
set States, we get that

$f^0(\mathsf{States}) = \mathsf{States}$

$f^1(\mathsf{States}) = \|P\|_V$

$f^2(\mathsf{States}) = \{ E \in \|P\|_V : E \to E' \text{ implies } E' \in \|P\|_V \}$

$f^3(\mathsf{States}) = \{ E \in \|P\|_V : E \to E' \text{ or } E \to\to E' \text{ implies } E' \in \|P\|_V \}$

$f^n(\mathsf{States}) =$ states in which $P$ is true throughout the duration of the first $n$ transitions.

This sequence is approaching the set $S$ of states in which $P$ is true in every
reachable state, which is the desired solution to our recursive equation and
easily seen to be a fixed point of the function $f$.
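The downward iteration in this example can be tried out on a small finite system. The following is a minimal sketch, not the book's own construction: the five-state system `trans`, the set `P` and the helper names are all hypothetical, and the system is chosen to have no deadlocked states so that the iteration from the empty set stays empty.

```python
def box_all(S, trans):
    """[-]S: the states all of whose successors lie in S."""
    return {s for s, succs in trans.items() if succs <= S}

def approximants(f, start):
    """Yield start, f(start), f(f(start)), ... until the sequence stabilises."""
    current = frozenset(start)
    while True:
        yield current
        nxt = frozenset(f(current))
        if nxt == current:
            return
        current = nxt

# Hypothetical system: s0 -> s1 -> s2 -> s3 -> s3, and t0 -> t0.
trans = {"s0": {"s1"}, "s1": {"s2"}, "s2": {"s3"}, "s3": {"s3"}, "t0": {"t0"}}
states = set(trans)
P = {"s0", "s1", "s2", "t0"}  # assumption: P fails only in s3

def f(S):
    """The function induced by X = P and [-]X."""
    return P & box_all(S, trans)

nu_chain = list(approximants(f, states))  # f^0(States), f^1(States), ...
mu_chain = list(approximants(f, set()))   # f^0(empty), f^1(empty), ...

for n, approx in enumerate(nu_chain):
    print(n, sorted(approx))
```

Each step from `States` discards the states that can reach a state violating `P` one transition later than the previous step detected; only `t0`, from which `P` holds in every reachable state, survives.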

Example 15.12

Consider the recursive equation expressing that a process is deadlockable:

$X = [-]\mathsf{false} \lor \langle - \rangle X$

This gives rise to the function

$f(S) = \{ E \in \mathsf{States} : E \not\to \text{ or } E \to E' \text{ with } E' \in S \}.$

Using the construction from Theorem 6.19 starting from the universal set
States, we discover that

$f(\mathsf{States}) = \mathsf{States}$

demonstrating that the greatest fixed point is the set of all states. This
certainly does not correspond to the property that a process is deadlockable.
However, starting from the empty set $\emptyset$, we get that

$f^0(\emptyset) = \emptyset$

$f^1(\emptyset) = \{ E \in \mathsf{States} : E \not\to \}$

$f^2(\emptyset) = \{ E \in \mathsf{States} : E \not\to \text{ or } E \to E' \not\to \}$

$f^3(\emptyset) = \{ E \in \mathsf{States} : E \not\to \text{ or } E \to E' \not\to \text{ or } E \to E' \to E'' \not\to \}$

$f^n(\emptyset) =$ states which can deadlock within the first $n$ transitions.

This sequence is approaching the set $S$ of states that can deadlock, which is
the desired solution to our recursive equation and easily seen to be a fixed
point of the function $f$.
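The upward iteration for deadlockability can be sketched in the same way. The four-state system and all names below are hypothetical, chosen only so that one state (`d`) can never deadlock.

```python
def diamond(S, trans):
    """<->S: the states with at least one successor in S."""
    return {s for s, succs in trans.items() if succs & S}

def deadlocked(trans):
    """[-]false: the states with no outgoing transition."""
    return {s for s, succs in trans.items() if not succs}

def lfp_chain(f):
    """Return [f^0(empty), f^1(empty), ...] up to the first repeated set."""
    chain = [set()]
    while True:
        nxt = f(chain[-1])
        if nxt == chain[-1]:
            return chain
        chain.append(nxt)

# Hypothetical system: a -> b -> c (c is stuck), and d -> d forever.
trans = {"a": {"b"}, "b": {"c"}, "c": set(), "d": {"d"}}
states = set(trans)

def f(S):
    """The function induced by X = [-]false or <->X."""
    return deadlocked(trans) | diamond(S, trans)

chain = lfp_chain(f)
for n, approx in enumerate(chain):
    print(n, sorted(approx))
```

Starting from the full set of states gives `f(states) == states` immediately, which is why the greatest fixed point is uninformative here, while the chain from the empty set grows one transition at a time.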


15.5 Expressing Standard Temporal Operators

The  intuition  which  you  should  have  drawn  from  above  is  the  following. 

•  Greatest  fixed  point  properties  are  those  for  which  you  need  to  unroll 
the  underlying  recursive  equation  top-down  (from  true,  or  the  full  set  of 
states) an infinite number of times in order to verify that the property
is always true; if the property fails for some finite unrolling, then the
fixed point property itself fails.

In  this  sense,  greatest  fixed  point  properties  are  representative  of  safety 
properties  which  assert  that  nothing  bad  ever  happens. 

•  Least  fixed  point  properties  are  those  for  which  you  need  to  unroll  the 
underlying  recursive  equation  bottom-up  (from  false,  or  the  empty  set 
of  states)  a  finite  number  of  times  in  order  to  verify  that  the  property 
is  eventually  true;  if  the  property  fails  for  every  finite  unrolling,  then 
the  fixed  point  property  itself  fails. 

In  this  sense,  least  fixed  point  properties  are  representative  of  liveness 
properties  which  assert  that  something  good  eventually  happens. 

We  now  consider  how  to  express  each  of  the  standard  temporal  operators 
introduced  in  Section  15.1  in  the  mu-calculus. 

15.5.1 Always: $\Box P$

The temporal operator $\Box P$, expressing that the property $P$ is true in every
state into which the process may evolve, is defined by the recursive equation

$X = P \land [-]X.$

In order to establish the truth of $\Box P$, the recursive equation would need to
be unrolled forever, to make sure nothing goes wrong. As such, this property
is expressed by the greatest fixed point formula:

$\Box P = \nu X.\, P \land [-]X.$

15.5.2 Possibly: $\Diamond P$

The temporal operator $\Diamond P$, expressing that the property $P$ is true in some
state into which the process may evolve, is defined by the recursive equation

$X = P \lor \langle - \rangle X.$

In order to establish the truth of $\Diamond P$, the recursive equation would need to
be unrolled only until the property can be verified - that is, only until the
property $P$ becomes true. As such, this property is expressed by the least
fixed point formula:

$\Diamond P = \mu X.\, P \lor \langle - \rangle X.$

15.5.3 Until: $P\,\mathsf{U}\,Q$

The temporal operator $P\,\mathsf{U}\,Q$, expressing that the property $P$ remains true
until the property $Q$ becomes true, which it must eventually do, is defined
by the recursive equation

$X = Q \lor (P \land \langle - \rangle\mathsf{true} \land [-]X)$

In order to establish the truth of $P\,\mathsf{U}\,Q$, the recursive equation would need
to be unrolled only until the property can be verified - that is, only until
the property $Q$ becomes true (verifying along the way that $P$ remains true).
As such, this property is expressed by the least fixed point formula:

$P\,\mathsf{U}\,Q = \mu X.\, Q \lor (P \land \langle - \rangle\mathsf{true} \land [-]X).$
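The three fixed point formulae of this section can be evaluated directly on a finite transition system by iterating from the full set of states (greatest fixed point) or from the empty set (least fixed point). The sketch below is illustrative only: the six-state system, the sets `P` and `Q` and every function name are assumptions made for the example.

```python
def pre_some(S, trans):
    """<->S: the states with some successor in S."""
    return {s for s, succs in trans.items() if succs & S}

def pre_all(S, trans):
    """[-]S: the states all of whose successors are in S."""
    return {s for s, succs in trans.items() if succs <= S}

def lfp(f):
    """Least fixed point of a monotone f on a finite powerset."""
    current = set()
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt

def gfp(f, states):
    """Greatest fixed point of a monotone f on a finite powerset."""
    current = set(states)
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt

# Hypothetical system: 0 -> 1 -> 3, 0 -> 2 -> 3, 3 -> 3, 4 -> 4, 5 is stuck.
trans = {0: {1, 2}, 1: {3}, 2: {3}, 3: {3}, 4: {4}, 5: set()}
states = set(trans)
P, Q = {0, 1, 2, 4, 5}, {3}
can_move = pre_some(states, trans)  # <->true

always_P = gfp(lambda X: P & pre_all(X, trans), states)
possibly_Q = lfp(lambda X: Q | pre_some(X, trans))
P_until_Q = lfp(lambda X: Q | (P & can_move & pre_all(X, trans)))

print(sorted(always_P), sorted(possibly_Q), sorted(P_until_Q))
```

In this example state 4 satisfies `P` forever but never reaches `Q`, and state 5 is stuck inside `P`; both are excluded from the until set, the latter because of the conjunct corresponding to $\langle - \rangle\mathsf{true}$.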


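Exercise 15.12 (Solution on page 491)

1. $\Box P = \nu Z.\, P \land [-]Z$ means $P$ holds in every state.

What does $\mu Z.\, P \land [-]Z$ mean?

2. $\Diamond P = \mu Z.\, P \lor \langle - \rangle Z$ means $P$ holds in some (reachable) state.
What does $\nu Z.\, P \lor \langle - \rangle Z$ mean?

3. $P\,\mathsf{U}\,Q = \mu Z.\, Q \lor (P \land \langle - \rangle\mathsf{true} \land [-]Z)$ means $Q$ will become true,
and until then $P$ will remain true.

What does $\nu Z.\, Q \lor (P \land \langle - \rangle\mathsf{true} \land [-]Z)$ mean?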


15.6 Further Fixed Point Properties

In this section we look at a collection of properties that can be expressed in
the mu-calculus.

There is an $a^\omega$ path.

By this, we mean that we can do an infinite number of consecutive $a$ transitions
starting from the state in question.

If we let $X$ represent this property, then $X$ satisfies the recursive equation

$X = \langle a \rangle X$ (It is possible to do an $a$ transition and
go to a state in which the property holds.)

As we are clearly wanting to unroll this fixed point equation infinitely often,
to verify that the property holds forever, we are in this case interested in
the greatest fixed point solution:

$\nu X.\langle a \rangle X.$

There is no $a^\omega$ path.

This is the negation of the previous property, and the most straightforward
way to find a mu-calculus formula which expresses it is to use the construction
from Exercise 15.7:

$\mathrm{neg}(\nu X.\langle a \rangle X) = \mu X.[a]X.$

Unrolling the underlying recursive equation

$X = [a]X$ (If I do an $a$ transition, I must end up
in a state in which the property holds.)

suggests an exploration of each $a^\omega$ path; as this is a least fixed point property,
this search must terminate, namely at a state in which an $a$ transition is not
possible.

$P$ holds at every state along some $a^\omega$ path.

This is a simple adaptation of the first property above:
$\nu X.\, P \land \langle a \rangle X.$

$P$ holds somewhere along some $a^\omega$ path.

We note first that we want to get to a state at which the property $P$ holds
by only doing $a$ transitions. This property is expressed by the recursive
equation

$X = P \lor \langle a \rangle X$ (Either $P$ is true, or it is possible to do an $a$ transition
and end up in a state in which the property holds.)

As we need $P$ to be true at some point, this is a least fixed point property:
$\mu X.\, P \lor \langle a \rangle X.$

This is not the end of the story, as we require that this path of $a$ transitions
leading up to the state in which $P$ is true be the start of an $a^\omega$ path.
In other words, the point at which $P$ is true must be the start of an $a^\omega$ path
- which is the first property we considered above: $\nu X.\langle a \rangle X$. Hence, the
property we seek is as follows:

$\mu X.(P \land \nu X.\langle a \rangle X) \lor \langle a \rangle X.$

This formula is fine; however, to avoid confusion it is best to use different
variables for the two fixed point constructions:

$\mu X.(P \land \nu Y.\langle a \rangle Y) \lor \langle a \rangle X.$

$P$ holds at every state along every $a^\omega$ path.

We first express the property that $P$ holds along every (finite or infinite)
path of $a$ transitions:

$\nu X.\, P \land [a]X.$

As a greatest fixed point property, the only way this property can fail is if we
reach a state after some number of $a$ transitions in which $P$ does not hold. If
this is the case, then we want to ensure that we are not on an $a^\omega$ path; that
is, that from this state in which $P$ fails to hold we cannot continue along
an $a^\omega$ path - which is the second property we considered above: $\mu X.[a]X$.
Hence the property we seek is as follows:

$\nu X.(P \lor \mu X.[a]X) \land [a]X.$

Again it is sensible to use different variables for the two fixed point
constructions:

$\nu X.(P \lor \mu Y.[a]Y) \land [a]X.$
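These path properties can be checked on a small labelled transition system. The sketch below is an illustration under stated assumptions: the system `trans`, the choices of `P` and all helper names are hypothetical. Because the inner formula $\mu Y.[a]Y$ does not mention $X$, it can be computed once and reused inside the outer iteration.

```python
def dia(a, S, trans):
    """<a>S: the states with some a-transition into S."""
    return {s for s, edges in trans.items()
            if any(lab == a and t in S for lab, t in edges)}

def box(a, S, trans):
    """[a]S: the states all of whose a-transitions lead into S."""
    return {s for s, edges in trans.items()
            if all(t in S for lab, t in edges if lab == a)}

def lfp(f):
    current = set()
    while f(current) != current:
        current = f(current)
    return current

def gfp(f, states):
    current = set(states)
    while f(current) != current:
        current = f(current)
    return current

# Hypothetical system: u loops on a; v -a-> w and w is stuck;
# x can go by a to u or to w; y can only do b.
trans = {
    "u": [("a", "u")],
    "v": [("a", "w")],
    "w": [],
    "x": [("a", "u"), ("a", "w")],
    "y": [("b", "u")],
}
states = set(trans)

has_omega = gfp(lambda X: dia("a", X, trans), states)  # nu X.<a>X
no_omega = lfp(lambda Y: box("a", Y, trans))           # mu Y.[a]Y

def every_state_on_every_omega_path(P):
    """nu X.(P or mu Y.[a]Y) and [a]X, with the inner fixed point precomputed."""
    return gfp(lambda X: (P | no_omega) & box("a", X, trans), states)

print(sorted(has_omega), sorted(no_omega))
print(sorted(every_state_on_every_omega_path({"x"})))
```

On this system the two simple properties come out as complements of each other, as the negation construction predicts. With `P = {"x"}` the nested formula holds only in the states with no infinite `a` path, since every such path eventually sits in `u`, where `P` fails.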


Exercise 15.13 (Solution on page 491)

Give mu-calculus formulae for the following properties. In each case give an
intuitive explanation of your solution.

1. $P$ almost always holds along some $a^\omega$ path.

Note: To say that something holds almost always means always apart
from a finite number of times. Thus, this property says that $P$ holds
everywhere along some $a^\omega$ path after some point along this path.

2. $P$ holds infinitely often along some $a^\omega$ path.


15.7 Additional Exercises

1.  Give  a  semantic  definition  for  the  weak  until  temporal  operator  PW  Q 
which  asserts  that  the  property  P  remains  true  until  the  property  Q 
becomes  true,  but  allows  that  the  property  Q  may  never  become  true 
(in  which  case  the  property  P  remains  true  for  as  long  as  the  process 
evolves). 

2.  We  noted  in  Section  13.2  that  we  can  express  the  property  that  the 
action  a  must  happen  as: 

$P = \langle a \rangle\mathsf{true} \land \bigwedge_{b \neq a} [b]\mathsf{false}$

which says that $a$ may happen and nothing other than $a$ may happen.

What  is  the  difference  between  the  property  “eventually  a  must  hap¬ 
pen”  as  expressed  by  the  temporal  formula  Bv  P  (where  the  Eventually 
temporal  operator  was  defined  in  Exercise  15.3)  and  the  property  “a 
must  eventually  happen ”? 

(Hint: exactly one of these properties holds for the process $b.0 + a.a.0$.)

3.  Prove  Theorem  15.3. 

4. Prove that if $X$ is not a free variable of $P$ then for any $S \subseteq \mathsf{States}$,

$\|P\|_{V[X \mapsto S]} = \|P\|_V$

5. Prove that $E \models_V P$ if, and only if, $E \in \|P\|_V$, where $P$ ranges over
formulae of the modal mu-calculus.

6. Prove the second part of Theorem 15.10, that $f^n(\mathsf{States}) = \|\nu^n X.P\|_V$
where $f(S) = \|P\|_{V[X \mapsto S]}$.

7.  Express  the  following  properties  in  the  modal  mu-calculus. 

(a) $P$ holds at some state along every $a^\omega$ path.

(b) $P$ almost always holds along every $a^\omega$ path.

(c) $P$ holds infinitely often along every $a^\omega$ path.

8.  Express  the  property  of  mutual  exclusion  in  the  modal  mu-calculus; 
that  is,  the  property  that  whenever  an  entry  action  occurs  (signifying 
that  a  process  has  entered  the  critical  region),  then  a  further  entry 
action  cannot  occur  until  an  exit  action  occurs. 

9. The extended modality $\langle a \rangle^* P$ expresses that the property $P$ holds
after some number of $a$ transitions, while the extended modality $[a]^* P$
expresses that the property $P$ holds after any number of $a$ transitions.

Express  these  extended  modalities  as  mu-calculus  formulae. 

10.  Express  the  following  properties  in  the  modal  mu-calculus. 

(a)  In  some  run  the  action  a  does  not  happen. 

(b)  The  actions  a  and  b  happen  alternately  forever  (starting  with  the 
action  a),  with  any  number  of  occurrences  of  other  actions  before 
and  between  the  a  and  b  actions. 

(c)  In  any  run,  a  and  b  happen  infinitely  often. 

(d)  If  a  and  b  happen  infinitely  often,  then  P  is  true  infinitely  often. 

(e)  In  any  run,  P  is  true  at  least  twice. 

(f)  In  any  run,  P  is  true  exactly  twice. 

11. Let $E \stackrel{\mathrm{def}}{=} a.E + a.F$, $F \stackrel{\mathrm{def}}{=} b.G$, and $G \stackrel{\mathrm{def}}{=} a.G$, and consider the
following two properties:

$\phi_1 = \mu Y.(\nu X.\langle a \rangle\mathsf{true} \land [-]X) \lor [-]Y$

$\phi_2 = \mu Y.\nu X.(\langle a \rangle\mathsf{true} \land [-]X) \lor [-]Y$

The process state $E$ satisfies $\phi_2$ but not $\phi_1$.

Can  you  understand  and  explain  why  this  is  the  case? 

Can  you  work  out  what  these  two  properties  are  expressing? 


Solutions  to  Exercises 


In  the  book  of  life,  the  answers  aren’t  in  the  back 

-  Charlie  Brown. 


Chapter  1 

Exercise  1.1  (page  20) 

1,  2,  4,  6,  7  and  8  are  statements,  while  3  and  5  are  not. 

Note  that  7  refers  to  some  unspecified  utterance  by  Felix,  upon  which 
the  truth  of  this  statement  depends.  Statement  8  however  is  more  compli¬ 
cated:  if  the  sentence  it  refers  to  is  itself,  then  there  is  no  consistent  way  to 
determine  its  truth  value:  if  what  it  says  is  true,  then  it  must  be  false;  and 
if  what  it  says  is  false,  then  it  must  be  true. 

Exercise  1.2  (page  20) 

1.  This  is  not  a  valid  deduction.  There  may  be  some  other  reason  that 
everyone  is  leaving  the  building,  e.g.,  it  may  be  closing  for  the  night. 

2.  Ideally  this  would  be  true,  but  it  is  not  a  valid  deduction.  It  would 
be  valid  to  deduce  that  everyone  must  leave  the  building;  however, 
saying  someone  (or  something)  must  behave  in  a  particular  fashion 
does  not  make  it  so;  for  example,  some  people  may  ignore  fire  alarms, 
considering  fire  alarm  testing  to  be  a  nuisance. 

3.  This  is  not  a  valid  deduction.  The  conclusion  is  no  doubt  true,  as 
there  is  surely  a  rule  that  states  that  a  train  must  wait  at  a  red  signal; 
but  this  rule  is  not  provided  in  the  argument.  It  might  be  that  the 
rules  for  the  railway  in  question  do  not  state  that  trains  must  wait  at 
a  red  light. 

4.  This  is  not  a  valid  deduction.  The  conclusion  is  true,  but  not  for  the 
reasons  provided  in  the  two  premises. 

5.  This  is  not  a  valid  deduction.  The  rook  that  has  already  moved  might 
not be the one involved in the castling.
Undergraduate Topics in Computer Science,

DOI  10.1007/978-1-84800-322-4,  ©  Springer- Verlag  London  2013 




6.  A  judgement  as  to  the  validity  of  this  deduction  cannot  be  made  on 
purely  logical  grounds,  due  to  the  ambiguity  of  the  language  of  the 
city by-law. Specifically, what is the status of the conjunction “and”
in  the  by-law?  As  Charles  does  not  keep  any  cats,  and  certainly  no 
more  than  three,  it  could  be  argued  that  it  is  within  his  rights  to  keep 
five  (or  even  fifty)  dogs  on  his  property.  Even  worse,  the  “more  than” 
in  “more  than  three  dogs  and  three  cats”  might  only  apply  to  dogs 
and  not  cats,  thereby  making  the  keeping  of,  say,  five  dogs  and  any 
number  of  cats  allowed  except  when  there  are  exactly  three  cats. 

Exercise  1.3  (page  21) 

1.  This  is  a  valid  deduction. 

2.  This  is  a  valid  deduction.  The  conclusion  that  Epimenides  is  a  liar 
follows  from  the  premises,  as  a  truth-telling  Cretan  cannot  say  that  all 
Cretans  (including  himself)  are  liars. 

3. This is not a valid deduction. It may be that all Cretans are liars; or
it may be that Epimenides is the only liar. Also, from the previous
deduction we already know that Epimenides is a liar based on the
given premises, so the conclusion, being precisely what Epimenides
claims, must be false.

4.  This  is  a  valid  deduction.  We  know  that  the  premises  imply  that 
Epimenides  is  a  liar,  so  his  claim  that  all  Cretans  are  liars  must  be 
false. 

5.  This  is  not  a  valid  deduction.  Aristotle  may  be  a  liar. 

Exercise  1.4  (page  23) 

1.  “The  earth  does  not  revolve  around  the  sun.” 

2.  “I  have  at  least  one  daughter.” 

3.  2  +  2  >4. 

Exercise  1.5  (page  24) 

1.  This  is  true,  as  the  second  disjunct  is  true  (although  the  first  disjunct 
is  false). 

2.  This  is  false,  as  neither  disjunct  is  true. 

3.  This  is  true,  as  both  disjuncts  are  true. 

Exercise  1.6  (page  24) 

1.  Inclusive.  This  statement  implies  that  Joel  could  not  have  come  in 
last place if he beat both Felix and Oskar, so he must have lost to one
of them; but he may well have lost to both of them.

2.  Exclusive.  It  is  impossible  for  a  light  to  be  both  on  and  off  at  the  same 
time. 

3.  Exclusive.  The  server  no  doubt  intends  to  offer  the  guest  only  one  of 
the  beverages.  However,  if  the  guest  is  so  odd  as  to  ask  for  a  cup  of 
both  (either  one  cup  with  a  mix  of  coffee  and  tea;  or  two  cups,  one 
with  coffee  and  the  other  with  tea),  the  server  will  no  doubt  reluctantly 
oblige. 

Exercise  1.7  (page  25) 

1.  This  is  false,  as  only  the  second  conjunct  is  true  (the  first  conjunct  is 
false). 

2.  This  is  false,  as  neither  conjunct  is  true. 

3.  This  is  true,  as  both  conjuncts  are  true. 

Exercise  1.8  (page  26) 

1.  AmandaHappy  =>  JoelHappy. 

2. JoelHappy => AmandaHappy.

3.  AmandaHappy  =>  JoelHappy. 

Exercise  1.9  (page  26) 

It may well be true that barking dogs don’t bite (i.e., Bark => ¬Bite), but
this  says  nothing  about  the  habits  of  dogs  that  don’t  bark;  they  may  bite, 
or  they  might  not. 


Exercise  1.10  (page  29) 

1. $p \mid q = \neg(p \land q)$.

2. $p \downarrow q = \neg(p \lor q)$.

3. $q \lhd p \rhd r = (p \land q) \lor (\neg p \land r)$.


Exercise  1.12  (page  31) 


1. $P \Rightarrow Q \Leftrightarrow Q \Rightarrow P$ has the following syntax tree:

           ⇔
         /   \
       ⇒       ⇒
      / \     / \
     P   Q   Q   P

It would be sensible in this example to include redundant parentheses
for readability, and to write the formula as $(P \Rightarrow Q) \Leftrightarrow (Q \Rightarrow P)$.

2. This is not a well-formed formula.

3.  (P  V  Q)  A  P  has  the  following  syntax  tree: 


Due  to  the  precedence  rules,  the  parentheses  are  not  redundant;  P  V 
Q  A  P  would  be  interpreted  as  P  V  (Q  A  P). 

4.  This  is  not  a  well-formed  formula. 

5.  PvQAP  o  PvQa(PvP)  has  the  following  syntax  tree: 


/  \ 

P  P 


In  this  case,  only  one  pair  of  parentheses  is  redundant;  however,  it 
would  be  sensible  to  avoid  confusion  by  including  all  of  the  redundant 
parentheses. 


Exercise  1.14  (page  33) 

This  example  hints  at  the  many  complicated  ways  that  English  (or  any 
natural  language)  can  be  used  to  express  simple  facts.  We  can  draw  the 
conclusion  that  Lewis  Carroll  is  after  by  making  clear  what  each  of  the 
above  assumptions  is  saying. 

Firstly,  we  introduce  propositional  variables  to  represent  the  different 
atomic  propositions  that  appear  in  the  argument. 


Love:    “Amos Judd loves cold mutton.”
Police:  “Amos Judd is a policeman on this beat.”
Sup:     “Amos Judd sups with our cook.”
Long:    “Amos Judd has long hair.”
Poet:    “Amos Judd is a poet.”
Prison:  “Amos Judd has been to prison.”
Cousin:  “Amos Judd is our cook’s cousin.”

We wish to deduce, formally and logically, the truth of the atomic proposition
Love, which asserts that “Amos Judd loves cold mutton.” Notice that
we  modelled  the  problem  by  instantiating  the  properties  of  all  men  to  apply 
only  to  Amos  Judd,  as  he  is  the  only  man  in  whom  we  have  any  interest. 

The  seven  assumptions  above  then  translate  into  the  following  proposi¬ 
tional  formulae: 

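1. $\mathit{Police} \Rightarrow \mathit{Sup}$.

2. $\mathit{Long} \Rightarrow \mathit{Poet}$.

3. $\neg\mathit{Prison}$.

4. $\mathit{Cousin} \Rightarrow \mathit{Love}$.

5. $\mathit{Poet} \Rightarrow \mathit{Police}$.

6. $\mathit{Sup} \Rightarrow \mathit{Cousin}$.

7. $\neg\mathit{Prison} \Rightarrow \mathit{Long}$.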

You should think carefully about each of these translations, and make sure
that you understand why they are correct. Assumptions 5 and 6 are particularly
tricky. For example, when 5 says that “None but policemen on this
beat are poets,” it is asserting that in order to be a poet you must be a
policeman on this beat. Thus, if Amos Judd is a poet (Poet), then Amos
Judd must be a policeman on this beat (Police): $\mathit{Poet} \Rightarrow \mathit{Police}$. Also, when
7 says that “Men with short hair have all been in prison,” it is asserting
that anyone who has not been to prison must have long hair; thus if Amos
Judd has not been to prison ($\neg\mathit{Prison}$), then Amos Judd must have long hair
(Long): $\neg\mathit{Prison} \Rightarrow \mathit{Long}$.

We  can  finally  work  out  the  logic,  step-by-step,  behind  the  claim  that 
“Amos  Judd  loves  cold  mutton”  (Love): 


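        $\neg\mathit{Prison}$   (by 3).
Thus    $\mathit{Long}$      (by 7, $\neg\mathit{Prison} \Rightarrow \mathit{Long}$).
Thus    $\mathit{Poet}$      (by 2, $\mathit{Long} \Rightarrow \mathit{Poet}$).
Thus    $\mathit{Police}$    (by 5, $\mathit{Poet} \Rightarrow \mathit{Police}$).
Thus    $\mathit{Sup}$       (by 1, $\mathit{Police} \Rightarrow \mathit{Sup}$).
Thus    $\mathit{Cousin}$    (by 6, $\mathit{Sup} \Rightarrow \mathit{Cousin}$).
Thus    $\mathit{Love}$      (by 4, $\mathit{Cousin} \Rightarrow \mathit{Love}$).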

The  last  line  is  the  conclusion  that  we  sought.  (Along  the  way,  we  also 
deduced  that  Amos  Judd  has  long  hair;  he  is  a  poet;  he  is  a  policeman  on 
this  beat;  he  sups  with  our  cook;  and  he  is  a  cousin  of  the  cook.) 
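The chain of deductions above is an instance of repeatedly applying modus ponens until nothing new can be derived. A minimal sketch of that process follows; the atom name `NotPrison` (standing for the negated proposition in assumptions 3 and 7) and the function name are assumptions made for this illustration.

```python
def forward_chain(facts, rules):
    """Apply rules of the form (premise, conclusion) until nothing new is derived.

    Returns the newly derived atoms in the order in which they were found.
    """
    known = set(facts)
    derived = []
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                derived.append(conclusion)
                changed = True
    return derived

# Assumptions 1, 2, 4, 5, 6 and 7 as implications; assumption 3 is the only fact.
rules = [
    ("Police", "Sup"),      # 1
    ("Long", "Poet"),       # 2
    ("Cousin", "Love"),     # 4
    ("Poet", "Police"),     # 5
    ("Sup", "Cousin"),      # 6
    ("NotPrison", "Long"),  # 7
]
facts = {"NotPrison"}       # 3

derived = forward_chain(facts, rules)
print(derived)
```

The derived atoms appear in the same order as in the step-by-step argument, ending with `Love`.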
Exercise 1.15 (page 34)

The  first  clause  states  that  the  right  to  castle  with  a  particular  rook  (either 
the  left  rook  or  the  right  rook)  has  been  lost  if  either  the  king  or  the  rook 
in  question  has  already  moved: 

$\mathit{KingMoved} \lor \mathit{LeftRookMoved} \Rightarrow \neg\mathit{RightToCastleLeft}$.

$\mathit{KingMoved} \lor \mathit{RightRookMoved} \Rightarrow \neg\mathit{RightToCastleRight}$.

The second clause states that the player may not castle with a particular
rook if the right to do so has been lost, or if there is a piece between the
king and the rook in question, or if the square on which the king stands, or
the square which it must cross, or the square which it is to occupy is under
attack:

$\neg\mathit{RightToCastleLeft} \lor \mathit{PieceBetweenLeft} \lor \mathit{KingAttack} \lor \mathit{LeftSquareAttack} \lor \mathit{KingMoveLeftAttack} \Rightarrow \neg\mathit{MayCastleLeft}$

$\neg\mathit{RightToCastleRight} \lor \mathit{PieceBetweenRight} \lor \mathit{KingAttack} \lor \mathit{RightSquareAttack} \lor \mathit{KingMoveRightAttack} \Rightarrow \neg\mathit{MayCastleRight}$

Exercise 1.16 (page 35)

1. We need to express the property that the piece of paper held by each
boy has exactly one of the other’s names on it, and that each name is
written on a piece of paper held by exactly one other boy.

The following proposition $p$ expresses that the piece of paper held by
each boy has exactly one of the other’s names on it:

$p = (\mathit{FonJ} \lor \mathit{OonJ}) \land (\neg\mathit{FonJ} \lor \neg\mathit{OonJ})$
$\land\ (\mathit{JonF} \lor \mathit{OonF}) \land (\neg\mathit{JonF} \lor \neg\mathit{OonF})$
$\land\ (\mathit{JonO} \lor \mathit{FonO}) \land (\neg\mathit{JonO} \lor \neg\mathit{FonO}).$

For succinctness, we could have used the exclusive-or connective:

$p = (\mathit{FonJ} \oplus \mathit{OonJ}) \land (\mathit{JonF} \oplus \mathit{OonF}) \land (\mathit{JonO} \oplus \mathit{FonO}).$

The following proposition $q$ expresses that each name is written on a
piece of paper held by exactly one other boy:

$q = (\mathit{JonF} \lor \mathit{JonO}) \land (\neg\mathit{JonF} \lor \neg\mathit{JonO})$
$\land\ (\mathit{FonJ} \lor \mathit{FonO}) \land (\neg\mathit{FonJ} \lor \neg\mathit{FonO})$
$\land\ (\mathit{OonJ} \lor \mathit{OonF}) \land (\neg\mathit{OonJ} \lor \neg\mathit{OonF}).$

Again this could be expressed more succinctly:

$q = (\mathit{JonF} \oplus \mathit{JonO}) \land (\mathit{FonJ} \oplus \mathit{FonO}) \land (\mathit{OonJ} \oplus \mathit{OonF}).$

The formula we seek is then $p \land q$.

2. From $\mathit{OonJ}$ we can deduce $\neg\mathit{FonJ}$ from $p$, from which we can deduce
$\mathit{FonO}$ from $q$, from which we can deduce $\neg\mathit{JonO}$ from $p$, from which we
can deduce $\mathit{JonF}$ from $q$.

In  summary,  we  have  “Oskar”  on  Joel’s  piece  of  paper,  “Joel”  on  Felix’s 
piece  of  paper,  and  “Felix”  on  Oskar’s  piece  of  paper. 
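The deduction in part 2 can be cross-checked by brute force over all 64 truth assignments to the six atomic propositions. This is only a sanity check sketched for illustration; the dictionary-based encoding and the helper names are assumptions, and exclusive-or on Booleans is written as `!=`.

```python
from itertools import product

ATOMS = ["FonJ", "OonJ", "JonF", "OonF", "JonO", "FonO"]

def p(v):
    """Each boy's paper carries exactly one of the other two names."""
    return ((v["FonJ"] != v["OonJ"]) and (v["JonF"] != v["OonF"])
            and (v["JonO"] != v["FonO"]))

def q(v):
    """Each name appears on exactly one of the other two boys' papers."""
    return ((v["JonF"] != v["JonO"]) and (v["FonJ"] != v["FonO"])
            and (v["OonJ"] != v["OonF"]))

assignments = [dict(zip(ATOMS, bits))
               for bits in product([False, True], repeat=len(ATOMS))]
models = [v for v in assignments if p(v) and q(v)]
with_oonj = [v for v in models if v["OonJ"]]

for v in with_oonj:
    print(sorted(name for name, value in v.items() if value))
```

Exactly two assignments satisfy the conjunction of `p` and `q` (the two ways of passing the names round in a cycle), and only one of them has “Oskar” on Joel’s paper.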

Exercise  1.19  (page  40) 

If  you  answer  this  question  quickly,  you  might  conclude  that  I  would  reject 
the  white  circle.  However,  this  would  be  wrong  if,  for  example,  I  had  the 
white  square  in  mind. 

In  fact,  you  cannot  conclude  that  I  will  reject  any  particular  symbol 
(though  you  can  conclude  that  I  will  reject  one  of  them,  you  just  cannot 
determine  which). 

Exercise  1.20  (page  40) 

Nine. 

The  point  of  this  old  joke  is  that  four  and  five  are  nine  irrespective  of 
the  premise  of  the  conditional  statement. 


Exercise  1.21  (page  42) 

Define  the  following  atomic  propositions. 

U  =  You  understand  implication. 

P  =  You  pass  the  exam. 

The  statement  translates  to  U  =>  P  which  has  the  following  truth  table: 


U  P  |  U ⇒ P
F  F  |    T
F  T  |    T
T  F  |    F
T  T  |    T

The only scenario in which the above statement can be considered false is
if U is true and P is false - that is, if you do not pass the exam despite
understanding implication.

Exercise  1.22  (page  44) 

Each  new  variable  doubles  the  number  of  combinations  of  truth  values. 
Thus,  a  truth  table  involving  four  propositional  variables  will  have  16  rows, 
and  one  involving  five  variables  will  have  32  rows.  In  general,  a  truth  table 
involving $n$ propositional variables will have $2^n$ rows.

Truth  tables  grow  very  quickly  with  the  number  of  propositional  vari¬ 
ables.  Building  truth  tables  for  propositions  with  many  variables,  such  as 
in  the  Amos  Judd  example  (Exercise  1.14),  can  therefore  be  frustrating  or 
even  infeasible. 
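The doubling is easy to observe by enumerating the rows directly. The short sketch below uses a hypothetical helper name and is meant only to illustrate the growth.

```python
from itertools import product

def truth_table_rows(n):
    """All combinations of truth values for n propositional variables."""
    return list(product([False, True], repeat=n))

row_counts = {n: len(truth_table_rows(n)) for n in range(1, 8)}
for n, count in row_counts.items():
    print(n, count)
```

For the seven variables of the Amos Judd example this already gives 128 rows.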

Exercise  1.23  (page  44) 
1.

    P  Q  |  ¬  (P  ⇔  ¬  Q)
    F  F  |  T   F  F  T  F
    F  T  |  F   F  T  F  T
    T  F  |  F   T  T  T  F
    T  T  |  T   T  F  F  T

2.

    P  Q  |  (P  ∧  Q)  ∨  (¬  P  ∧  ¬  Q)
    F  F  |   F  F  F   T   T  F  T  T  F
    F  T  |   F  F  T   F   T  F  F  F  T
    T  F  |   T  F  F   F   F  T  F  T  F
    T  T  |   T  T  T   T   F  T  F  F  T

3.

3. $(p \Rightarrow \neg p) \Leftrightarrow \neg p$ is a tautology:

    p  |  (p  ⇒  ¬  p)  ⇔  ¬  p
    F  |   F  T  T  F   T  T  F
    T  |   T  F  F  T   T  F  T

4. $(p \Rightarrow q) \Rightarrow p$ is neither a tautology nor a contradiction:

5. $p \Rightarrow (q \Rightarrow p)$ is a tautology:

    p  q  |  p  ⇒  (q  ⇒  p)
    F  F  |  F  T   F  T  F
    F  T  |  F  T   T  F  F
    T  F  |  T  T   F  T  T
    T  T  |  T  T   T  T  T

Exercise 1.26 (page 47)

The following is a truth table for these three propositions:


The formula $p$ representing the original program code is not equivalent to
the formula $q$ representing the first optimisation, as there are interpretations
of the atomic propositions which give rise to different truth values for $p$ and
$q$, highlighted in the third and fifth rows of the above truth table.

However, the formulae $p$ and $r$ are equivalent, as the truth values of
these formulae are the same under all interpretations, and hence the second
optimisation is valid.

Exercise 1.27 (page 50)

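1. $p \land (\neg p \lor q)$

$\Leftrightarrow\ (p \land \neg p) \lor (p \land q)$ (Distributivity)

$\Leftrightarrow\ \mathsf{false} \lor (p \land q)$ (Excluded Middle)

$\Leftrightarrow\ (p \land q) \lor \mathsf{false}$ (Commutativity)

$\Leftrightarrow\ p \land q$ (Tautology)

2. $\neg(p \Rightarrow q)$

$\Leftrightarrow\ \neg(\neg p \lor q)$ (Implication)

$\Leftrightarrow\ \neg\neg p \land \neg q$ (De Morgan)

$\Leftrightarrow\ p \land \neg q$ (Double Negation)

3. $p \Rightarrow (q \lor r)$

$\Leftrightarrow\ \neg p \lor (q \lor r)$ (Implication)

$\Leftrightarrow\ (\neg p \lor \neg p) \lor (q \lor r)$ (Idempotence)

$\Leftrightarrow\ (\neg p \lor q) \lor (\neg p \lor r)$ (Associativity, Commutativity)

$\Leftrightarrow\ (p \Rightarrow q) \lor (p \Rightarrow r)$ (Implication)

4. $p \Rightarrow (q \land r)$

$\Leftrightarrow\ \neg p \lor (q \land r)$ (Implication)

$\Leftrightarrow\ (\neg p \lor q) \land (\neg p \lor r)$ (Distributivity)

$\Leftrightarrow\ (p \Rightarrow q) \land (p \Rightarrow r)$ (Implication)

5. $(p \land q) \Rightarrow r$

$\Leftrightarrow\ \neg(p \land q) \lor r$ (Implication)

$\Leftrightarrow\ (\neg p \lor \neg q) \lor r$ (De Morgan)

$\Leftrightarrow\ (\neg p \lor \neg q) \lor (r \lor r)$ (Idempotence)

$\Leftrightarrow\ (\neg p \lor r) \lor (\neg q \lor r)$ (Associativity, Commutativity)

$\Leftrightarrow\ (p \Rightarrow r) \lor (q \Rightarrow r)$ (Implication)

6. $(p \lor q) \Rightarrow r$

$\Leftrightarrow\ \neg(p \lor q) \lor r$ (Implication)

$\Leftrightarrow\ (\neg p \land \neg q) \lor r$ (De Morgan)

$\Leftrightarrow\ (\neg p \lor r) \land (\neg q \lor r)$ (Distributivity)

$\Leftrightarrow\ (p \Rightarrow r) \land (q \Rightarrow r)$ (Implication)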
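Each of the six equivalences derived above by algebraic laws can also be confirmed semantically, by comparing both sides under all eight interpretations of p, q and r. The sketch below is a cross-check only; the `implies` helper and the table of lambdas are illustrative names, not part of the exercise.

```python
from itertools import product

def implies(a, b):
    """Material implication: a => b."""
    return (not a) or b

# (left-hand side, right-hand side) of each equivalence, as functions of p, q, r.
equivalences = {
    1: (lambda p, q, r: p and ((not p) or q),
        lambda p, q, r: p and q),
    2: (lambda p, q, r: not implies(p, q),
        lambda p, q, r: p and not q),
    3: (lambda p, q, r: implies(p, q or r),
        lambda p, q, r: implies(p, q) or implies(p, r)),
    4: (lambda p, q, r: implies(p, q and r),
        lambda p, q, r: implies(p, q) and implies(p, r)),
    5: (lambda p, q, r: implies(p and q, r),
        lambda p, q, r: implies(p, r) or implies(q, r)),
    6: (lambda p, q, r: implies(p or q, r),
        lambda p, q, r: implies(p, r) and implies(q, r)),
}

def equivalent(lhs, rhs):
    """True if lhs and rhs agree under every interpretation of p, q, r."""
    return all(bool(lhs(*v)) == bool(rhs(*v))
               for v in product([False, True], repeat=3))

results = {k: equivalent(lhs, rhs) for k, (lhs, rhs) in equivalences.items()}
print(results)
```

A truth-table check like this confirms the end points of each derivation but says nothing about which laws justify the intermediate steps.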
Chapter 2

Exercise  2.1  (page  59) 

1.  {1,  3,  5,  7}. 

2.  {Tuesday,  Thursday,  Friday,  Saturday}. 

3.  {Catherine  of  Aragon,  Anne  Boleyn,  Jane  Seymour, 

Anne  of  Cleves,  Catherine  Howard,  Catherine  Parr  }. 

4.  {  Sean  Connery,  George  Lazenby,  Roger  Moore, 

Timothy  Dalton,  Pierce  Brosnan,  Daniel  Craig}. 

Exercise  2.2  (page  60) 

1.  {2,  4,  6,  8,  10}. 

2.  {1,  2}. 

Exercise  2.3  (page  60) 

1,  3  and  5  are  true,  while  2  and  4  are  false. 

Exercise  2.4  (page  60) 

$A = E$ and $C = D$.

Exercise  2.5  (page  62) 

1  is  true,  while  2  and  3  are  false. 

Exercise  2.7  (page  65) 

If $R \in R$, then by definition of $R$ we would have $R \notin R$, which cannot be
true. Therefore we must have that $R \notin R$.

This is no longer a problem, as $R \notin R$ now means that either $R \notin A$ or
$R \in R$; since we know that $R \notin R$, this simply means that $R \notin A$.

Exercise  2.15  (page  69) 

The Venn diagram is depicted in Figure 15.2.

1. $A \cap C = \{5, 7, 9\}$.

2. $(A \cap B) \cup C = \{3, 5, 6, 7, 8, 9\}$.

3. $A \cap (B \cup C) = \{3, 5, 7, 9\}$.

4. $(A \cup B) \setminus C = \{1, 3, 4\}$.

5. $\overline{(A \cup B)} \cap C = \{6, 8\}$.

Figure 15.2: Venn diagram for Exercise 2.15.


Exercise  2.16  (page  69) 

You  can  use  Venn  diagrams  to  verify  these  properties. 

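1. If $A \subseteq B$, then $A \cup B = B$ and $A \cap B = A$.

2. If $A \subseteq B$, then $\overline{B} \subseteq \overline{A}$.

3. $\overline{\overline{A}} = A$.

4. If $C \subseteq A$ and $C \subseteq B$, then $C \subseteq A \cap B$.

5. If $A \subseteq C$ and $B \subseteq C$, then $A \cup B \subseteq C$.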

Exercise  2.17  (page  71) 

Letting D = Daniel, E = Ella, M = Mia, R = Rhodri and Z = Zoe, we get

$\mathcal{P}(\{D, E, M, R, Z\}) = \{\, \emptyset,$
$\{D\}, \{E\}, \{M\}, \{R\}, \{Z\},$
$\{D, E\}, \{D, M\}, \{D, R\}, \{D, Z\},$
$\{E, M\}, \{E, R\}, \{E, Z\},$
$\{M, R\}, \{M, Z\}, \{R, Z\},$
$\{D, E, M\}, \{D, E, R\}, \{D, E, Z\},$
$\{D, M, R\}, \{D, M, Z\}, \{D, R, Z\},$
$\{E, M, R\}, \{E, M, Z\}, \{E, R, Z\}, \{M, R, Z\},$
$\{D, E, M, R\}, \{D, E, M, Z\}, \{D, E, R, Z\},$
$\{D, M, R, Z\}, \{E, M, R, Z\},$
$\{D, E, M, R, Z\} \,\}.$

More specifically, there are the following subsets:

•  one subset with no elements (the empty set);

•  five singleton subsets (one for each element in the set);

•  ten subsets with two elements (one for each pair);

•  ten subsets with three elements (one for each pair left out);

•  five subsets with four elements (one for each element left out); and

•  one subset with five elements (the whole set itself).
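The counts in this list can be reproduced mechanically. The following sketch enumerates the subsets by size using the standard library; the function name `powerset` and the single-letter element names are choices made for this illustration.

```python
from collections import Counter
from itertools import combinations

def powerset(elements):
    """All subsets of the given elements, smallest first."""
    elements = list(elements)
    return [frozenset(c)
            for k in range(len(elements) + 1)
            for c in combinations(elements, k)]

children = ["D", "E", "M", "R", "Z"]
subsets = powerset(children)
by_size = Counter(len(s) for s in subsets)

print(len(subsets))
print([by_size[k] for k in range(len(children) + 1)])
```

The six counts add up to 32, the size of the power set of a five-element set.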

Exercise  2.18  (page  71) 

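1. $A = \mathcal{P}(\emptyset) = \{\emptyset\}$ contains 1 element.

2. $B = \mathcal{P}(A) = \{\emptyset, \{\emptyset\}\}$ contains 2 elements.

3. $C = \mathcal{P}(B) = \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \{\emptyset, \{\emptyset\}\}\}$ contains 4 elements.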

Exercise  2.19  (page  71) 

$\mathcal{P}(A) \cap \emptyset = \emptyset$ and $\mathcal{P}(A) \cap \{\emptyset\} = \{\emptyset\}$.

Exercise  2.20  (page  73) 

$\bigcap \mathcal{P}_{\mathrm{fin}}(A) = \emptyset$ and $\bigcup \mathcal{P}_{\mathrm{fin}}(A) = A$.

Note  that  the  union  of  infinitely-many  finite  sets  may  well  be  infinite, 
although  the  union  of  finitely-many  finite  sets  will  of  course  be  finite. 

Exercise  2.23  (page  75) 

$(p,q) + (r,s) = (ps + qr,\; qs)$ and $(p,q) \times (r,s) = (pr,\; qs)$.
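
As a quick sanity check, the two rules can be compared against exact rational arithmetic. This is only a sketch: it assumes each pair (p, q) stands for the fraction p/q with q nonzero, and the function names `add`, `mul` and `as_fraction` are hypothetical.

```python
from fractions import Fraction

def add(a, b):
    """(p, q) + (r, s) = (ps + qr, qs)."""
    (p, q), (r, s) = a, b
    return (p * s + q * r, q * s)

def mul(a, b):
    """(p, q) x (r, s) = (pr, qs)."""
    (p, q), (r, s) = a, b
    return (p * r, q * s)

def as_fraction(pair):
    """Interpret a pair (p, q) as the rational number p/q."""
    p, q = pair
    return Fraction(p, q)

x, y = (1, 2), (2, 3)          # 1/2 and 2/3
print(add(x, y))               # (7, 6)
print(mul(x, y))               # (2, 6)
```

The results are not reduced to lowest terms; (2, 6) and (1, 3) denote the same rational number.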

Exercise  2.24  (page  76) 

Consider  the  following  sets  of  people: 

Love  =  the  set  of  people  who  love  cold  mutton. 

Police  =  the  set  of  policemen  on  this  beat. 

Sup  =  the  set  of  people  who  sup  with  our  cook. 

Long  =  the  set  of  long-haired  people. 

Poet  =  the  set  of  poets. 

NoPrison  =  the  set  of  people  who  have  never  been  to  prison. 

Cousin  =  the  set  of  cousins  of  our  cook. 

The  above  seven  assumptions  then  translate  to  the  following  set  inclusions: 


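1. $\mathit{Police} \subseteq \mathit{Sup}$.

2. $\mathit{Long} \subseteq \mathit{Poet}$.

3. $\mathit{Amos} \in \mathit{NoPrison}$.

4. $\mathit{Cousin} \subseteq \mathit{Love}$.

5. $\mathit{Poet} \subseteq \mathit{Police}$.

6. $\mathit{Sup} \subseteq \mathit{Cousin}$.

7. $\mathit{NoPrison} \subseteq \mathit{Long}$.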

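We can then conclude that $\mathit{Amos} \in \mathit{Love}$, that is, that Amos Judd loves
cold mutton, as follows:

$\mathit{Amos} \in \mathit{NoPrison}$ (by 3).

Thus $\mathit{Amos} \in \mathit{Long}$ (by 7, $\mathit{NoPrison} \subseteq \mathit{Long}$).

Thus $\mathit{Amos} \in \mathit{Poet}$ (by 2, $\mathit{Long} \subseteq \mathit{Poet}$).

Thus $\mathit{Amos} \in \mathit{Police}$ (by 5, $\mathit{Poet} \subseteq \mathit{Police}$).

Thus $\mathit{Amos} \in \mathit{Sup}$ (by 1, $\mathit{Police} \subseteq \mathit{Sup}$).

Thus $\mathit{Amos} \in \mathit{Cousin}$ (by 6, $\mathit{Sup} \subseteq \mathit{Cousin}$).

Thus $\mathit{Amos} \in \mathit{Love}$ (by 4, $\mathit{Cousin} \subseteq \mathit{Love}$).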

The  last  line  is  the  conclusion  that  we  sought.  (Again,  along  the  way,  we 
also  deduced  that  Amos  Judd  has  long  hair;  he  is  a  poet;  he  is  a  policeman 
on  this  beat;  he  sups  with  our  cook;  and  he  is  a  cousin  of  the  cook.) 
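
The chain of inclusions can also be followed mechanically. The sketch below stores each premise of the form X ⊆ Y as a dictionary entry and walks from NoPrison (where premise 3 places Amos) until no further inclusion applies; the dictionary encoding and the function name `memberships` are illustrative assumptions.

```python
# Each premise "X is a subset of Y" is stored as the entry X -> Y.
inclusions = {
    "Police": "Sup",       # premise 1
    "Long": "Poet",        # premise 2
    "Cousin": "Love",      # premise 4
    "Poet": "Police",      # premise 5
    "Sup": "Cousin",       # premise 6
    "NoPrison": "Long",    # premise 7
}

def memberships(start, inclusions):
    """Follow the inclusions from `start`; assumes they contain no cycle."""
    chain = [start]
    while chain[-1] in inclusions:
        chain.append(inclusions[chain[-1]])
    return chain

# Premise 3: Amos is a member of NoPrison.
chain = memberships("NoPrison", inclusions)
print(" ⊆ ".join(chain))
```

Amos belongs to every set on the resulting chain, the last of which is Love.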

Exercise  2.25  (page  77) 

Let B stand for the set of all babies, I for the set of all illogical persons, D
for the set of despised persons and C for the set of those persons who can
manage a crocodile. Then the premises become:

$B \subseteq I$, $C \cap D = \emptyset$, and $I \subseteq D$

which are reflected in the following Venn diagram:


It is clear from this that no baby can manage a crocodile, as a baby would
be illogical ($B \subseteq I$) and hence despised ($I \subseteq D$); and no despised person,
such as this baby, could manage a crocodile.


Exercise  2.26  (page  77) 


Consider  the  following  Venn  diagram: 

F  =  things  full  of  water 
O  =  oceans 
P  =  ponds 

The first premise in the argument says that $O \subseteq F$; and the second premise
in the argument says that $P \cap O = \emptyset$. These premises are satisfied by the
above Venn diagram. However, the conclusion of the argument says that
$P \cap F = \emptyset$, which is not (necessarily) satisfied by the above Venn diagram.

The  argument  is  thus  not  valid,  as  the  above  Venn  diagram  suggests  a 
counter-example  to  the  argument:  there  may  well  be  ponds  which  are  not 
oceans  yet  are  nonetheless  full  of  water. 

Exercise  2.27  (page  80) 

The  two  Venn  diagrams  are  depicted  in  Figure  15.3. 

Exercise  2.28  (page  80) 

\[
\begin{aligned}
A \cap (\overline{A} \cup B) &= (A \cap \overline{A}) \cup (A \cap B) && \text{(Distributive Law)} \\
&= \emptyset \cup (A \cap B) && \text{(Complement Law)} \\
&= (A \cap B) \cup \emptyset && \text{(Commutative Law)} \\
&= A \cap B && \text{(Empty Set Law)}
\end{aligned}
\]

Exercise 2.29 (page 81)

• By Associativity, Commutativity and Idempotence, $(A \cap B) \cap A = A \cap B$.

• Letting $X = A \cap B$ and $Y = A$, this says that $X \cap Y = X$.

• This means that $X \subseteq Y$; that is, that $A \cap B \subseteq A$.

Exercise  2.30  (page  83) 

1. $A \subseteq B$ if, and only if, $\overline{B} \subseteq \overline{A}$.

2. $A = B$ if, and only if, $(A \subseteq B) \wedge (B \subseteq A)$.

Exercise  2.31  (page  83) 

We might naively translate the law

$\neg(P \Rightarrow Q) \Leftrightarrow P \wedge \neg Q$

into $A \not\subseteq B$ if, and only if, $A \cap \overline{B} = U$. This law for sets is blatantly false:
$A \cap \overline{B} = U$ can only be true if $A = U$ and $B = \emptyset$; and this is certainly not
the only situation in which we can have $A \not\subseteq B$.

The problem arises from attempting to translate the negation of an
implication. To get a correct law for sets corresponding to the given law for
propositions, we first simplify the law by negating both sides:

$(P \Rightarrow Q) \Leftrightarrow \neg(P \wedge \neg Q)$

Translating $P \Rightarrow Q$ into $A \subseteq B$, and expressing $\neg(P \wedge \neg Q)$ as $P \wedge \neg Q \Leftrightarrow F$,
gives rise to the following valid law for sets:

$A \subseteq B$ if, and only if, $A \cap \overline{B} = \emptyset$.
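
Both laws can be tested exhaustively on a small universe. The sketch below assumes a three-element universe `U` (an arbitrary choice) and checks every pair of subsets; it is an illustration on one finite universe, not a proof.

```python
from itertools import combinations

U = frozenset({1, 2, 3})   # a small, arbitrarily chosen universe

def subsets(universe):
    """All subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(s):
    return U - s

# Valid law: A is a subset of B  iff  A intersected with complement(B) is empty.
valid_law_holds = all(
    (A <= B) == (A & complement(B) == frozenset())
    for A in subsets(U) for B in subsets(U)
)

# Naive law: A is not a subset of B  iff  A intersected with complement(B) is U.
naive_law_fails = any(
    (not A <= B) != (A & complement(B) == U)
    for A in subsets(U) for B in subsets(U)
)

print(valid_law_holds, naive_law_fails)
```

For instance, A = {1} and B = ∅ already breaks the naive law: A is not a subset of B, yet A ∩ complement(B) = {1}, which is not U.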


Chapter  3 

Exercise  3.3  (page  89) 

It is straightforward, if a bit tedious, to verify that each of these laws holds
for every combination of values of x, y and z. For example, to verify that
the first Distributivity Law

x + (y×z) = (x+y) × (x+z)

is true, we need only use the tables defining + and × to check that the following
eight equations are true (one for each of the eight combinations of values for
x, y and z):
0 + (0×0) = (0+0) × (0+0)
0 + (0×1) = (0+0) × (0+1)
0 + (1×0) = (0+1) × (0+0)
0 + (1×1) = (0+1) × (0+1)
1 + (0×0) = (1+0) × (1+0)
1 + (0×1) = (1+0) × (1+1)
1 + (1×0) = (1+1) × (1+0)
1 + (1×1) = (1+1) × (1+1)


The  details  are  omitted. 
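
The omitted check is easy to automate. The sketch below assumes that the tables defining + and × are those of Boolean OR and AND on the values 0 and 1 (so that 1 + 1 = 1); the function names `plus` and `times` are illustrative.

```python
from itertools import product

def plus(x, y):    # assumed table for +: Boolean OR on {0, 1}
    return x | y

def times(x, y):   # assumed table for x: Boolean AND on {0, 1}
    return x & y

rows = []
for x, y, z in product((0, 1), repeat=3):
    lhs = plus(x, times(y, z))                 # x + (y x z)
    rhs = times(plus(x, y), plus(x, z))        # (x + y) x (x + z)
    rows.append((x, y, z, lhs, rhs))
    print(f"{x} + ({y}x{z}) = {lhs}    ({x}+{y}) x ({x}+{z}) = {rhs}")

# The law holds if the two sides agree on all eight rows.
law_holds = all(lhs == rhs for *_, lhs, rhs in rows)
print(law_holds)
```

The same loop, with the two expressions swapped out, checks any of the other laws.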


Exercise  3.9  (page  92) 

Since 0+1 = 1 (by Ident1) and 0×1 = 0 (by Ident2), the Uniqueness of
Complement Theorem 3.8 says that 0' = 1.

But then 1' = (0')' = 0 by the Involution Law (Theorem 3.9).

An alternative proof which avoids the use of the Uniqueness of Complement
Theorem is as follows:

0' = 0' + 0   (Ident1)          1' = 1' · 1   (Ident2)
   = 0 + 0'   (Comm1)              = 1 · 1'   (Comm2)
   = 1        (Compl1)             = 0        (Compl2)

Exercise 3.10 (page 93)

1. (xy + x'y')' = (x' + y')(x + y)          (De Morgan, Involution)
                = xx' + xy' + x'y + yy'     (Distr, Comm, Assoc)
                = xy' + x'y                 (Compl, Ident, Comm, Assoc)

2. Assume that x + y = x + z and x' + y = x' + z. Then

   xy = xx' + xy      (Compl2, Ident1, Comm1)
      = x(x' + y)     (Distr2)
      = x(x' + z)     (Assumption 2)
      = xx' + xz      (Distr2)
      = xz            (Compl2, Ident1, Comm1)

   Thus, with Assumption 1, we have from Theorem 3.7 that y = z.

3. If x + y = 0 then x' = x + y + x' = (x + x') + y = 1 + y = 1, so x = 0.

   By similar reasoning, if x + y = 0 then y = 0.

4. If x = 0 then x' = 1 and thus y = 0y' + 1y = xy' + x'y.

   Conversely, if y = xy' + x'y for all y, then taking y = 0, and thus
   y' = 1, we get that 0 = xy' + x'y = x1 + x'0 = x.
Exercise 3.11 (page 94)

1. ((x+y)(x'+y'))' = (x+y')(x'+y).

2. If xy = xz and x'y = x'z then y = z.

3. If xy = 1 then x = y = 1.

4. x = 1 if, and only if, y = (x+y')(x'+y) for all y.
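
The first identity of Exercise 3.10 and its dual (with the complement applied to the whole left-hand side) can be checked by truth table. This sketch only covers the two-element Boolean algebra on 0 and 1, which is an assumption of the example; the proofs from the axioms are what establish the laws in general.

```python
from itertools import product

def NOT(x):
    """Complement on {0, 1}."""
    return 1 - x

checked = 0
for x, y in product((0, 1), repeat=2):
    # Exercise 3.10(1): (xy + x'y')' = xy' + x'y
    lhs_310 = NOT((x & y) | (NOT(x) & NOT(y)))
    rhs_310 = (x & NOT(y)) | (NOT(x) & y)
    # The dual: ((x+y)(x'+y'))' = (x+y')(x'+y)
    lhs_dual = NOT((x | y) & (NOT(x) | NOT(y)))
    rhs_dual = (x | NOT(y)) & (NOT(x) | y)
    assert lhs_310 == rhs_310
    assert lhs_dual == rhs_dual
    checked += 1

print(checked, "combinations checked")
```

Dualising swaps + with × (and 0 with 1) throughout, which is why the second check mirrors the first.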

Exercise  3.14  (page  99) 

We start by expressing $x \oplus y$ in terms of the three basic operations:

$x \oplus y = (x + y)(xy)'$

The circuit for this is then as follows:

[Circuit diagram with inputs x and y.]


Exercise  3.15  (page  99) 

We  start  by  annotating  the  diagram  with  variables  for  all  of  the  intermediate 
values  which  are  computed: 


We  can  then  calculate  the  intermediate  and  final  values  by  considering  their 
Boolean  expressions: 


u = b'    v = c'    w = a + u    x = b + v    y = wc    z = ax    r = y + z

a  b  c  u  v  w  x  y  z  r
0  0  0  1  1  1  1  0  0  0
0  0  1  1  0  1  0  1  0  1
0  1  0  0  1  0  1  0  0  0
0  1  1  0  0  0  1  0  0  0
1  0  0  1  1  1  1  0  1  1
1  0  1  1  0  1  0  1  0  1
1  1  0  0  1  1  1  0  1  1
1  1  1  0  0  1  1  1  1  1
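
The table of intermediate and final values can be recomputed directly from the Boolean expressions u = b', v = c', w = a + u, x = b + v, y = wc, z = ax and r = y + z. The following Python sketch does so; the function name `circuit` is an illustrative choice.

```python
from itertools import product

def circuit(a, b, c):
    """Evaluate the intermediate values and the output r for inputs a, b, c."""
    u = 1 - b        # u = b'
    v = 1 - c        # v = c'
    w = a | u        # w = a + u
    x = b | v        # x = b + v
    y = w & c        # y = wc
    z = a & x        # z = ax
    r = y | z        # r = y + z
    return u, v, w, x, y, z, r

print("a b c u v w x y z r")
table = []
for a, b, c in product((0, 1), repeat=3):
    row = (a, b, c) + circuit(a, b, c)
    table.append(row)
    print(" ".join(str(bit) for bit in row))
```

Each printed row corresponds to one combination of the inputs a, b and c, in the order 000, 001, ..., 111.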
Exercise 3.16 (page 99)


The output R to be computed is given by the formula R = M(B' + C'),
which by De Morgan's Law can be rewritten as R = M(BC)'. Thus either
of the following two Boolean circuits will give a valid implementation.


Exercise  3.17  (page  102) 
29 + 22 = 51.

Exercise 4.2 (page 111)

1. $\{x : \mathit{Even}(x)\} = \{x \in \mathbb{Z} : x \text{ is even}\} = \{\ldots, -6, -4, -2, 0, 2, 4, 6, \ldots\}$.

2. $\{x : \mathit{EvenPrime}(x)\} = \{2\}$.

3. $\{x : \mathit{DeadlySin}(x)\}$ = {lust, gluttony, greed, sloth, wrath, envy, pride}.

4. $\{x : \mathit{Sum}(x, y, z)\} = \{(x, y, z) \in \mathbb{Z}^3 : x + y = z\}$

5. $\{x : \mathit{Sum}(u, 5, v)\} = \{(u, v) \in \mathbb{Z}^2 : u + 5 = v\}$
   $= \{\ldots, (-3,2), (-2,3), (-1,4), (0,5), (1,6), (2,7), (3,8), \ldots\}$.


Exercise  4.5  (page  115) 

1. $\forall x \forall y\, (B(x) \wedge F(y) \Rightarrow L(x,y))$.

2. $\forall x \forall y\, (B(x) \wedge L(x,y) \Rightarrow F(y))$.

3. $\forall x \forall y\, (F(y) \wedge L(x,y) \Rightarrow B(x))$.

Exercise  4.7  (page  117) 

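1. $\forall x\, (\mathit{Male}(x) \oplus \mathit{Female}(x))$.

2. $\forall x \forall y\, (\mathit{Mother}(x,y) \Rightarrow \mathit{Parent}(x,y) \wedge \mathit{Female}(x))$.

3. $\forall x \exists m \exists f \forall y\, ((\mathit{Mother}(y,x) \Leftrightarrow y = m) \wedge (\mathit{Father}(y,x) \Leftrightarrow y = f))$.

4. $\forall x \forall y\, (\mathit{Sibling}(x,y) \Rightarrow \forall z\, (\mathit{Parent}(z,x) \Leftrightarrow \mathit{Parent}(z,y)))$.

5. $\forall x \forall y\, (\mathit{Cousin}(x,y) \Rightarrow \exists u \exists v\, (\mathit{Parent}(u,x) \wedge \mathit{Parent}(v,y) \wedge \mathit{Sibling}(u,v)))$.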

Exercise  4.8  (page  117) 

The premise of the argument translates into

$\forall h\, (\mathit{Horse}(h) \Rightarrow \mathit{Animal}(h))$

which says that any thing h which is a horse is an animal.

The conclusion of the argument translates into

$\forall x\, (\exists h\, (\mathit{Horse}(h) \wedge \mathit{Head}(x,h)) \Rightarrow \exists a\, (\mathit{Animal}(a) \wedge \mathit{Head}(x,a)))$

which says that any thing x which is the head of some horse h is the head
of some animal a.

This  argument  is  valid:  for  suppose  x  is  the  head  of  some  horse  h  (Black 
Beauty,  say).  Since  the  premise  says  that  all  horses  are  animals,  this  par¬ 
ticular  horse  h  (Black  Beauty)  is  an  animal;  and  hence  this  thing  x  is  the 
head  of  some  animal  a,  namely  h  (Black  Beauty). 

Exercise  4.9  (page  120) 

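1. $\exists! c\, (T(\mathit{Alice}, c) \wedge T(\mathit{Bob}, c))$

2. $\exists c_1\, (T(\mathit{Alice}, c_1) \wedge T(\mathit{Bob}, c_1) \wedge \exists! c_2\, (T(\mathit{Alice}, c_2) \wedge T(\mathit{Bob}, c_2) \wedge c_1 \neq c_2))$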

Exercise  4.10  (page  121) 

1. $\exists x\, \mathit{LikesMaths}(x)$, where LikesMaths(x) = "x likes maths".

   Its negation is (b).

   (a) $\exists x\, \neg\mathit{LikesMaths}(x)$.

   (b) $\forall x\, \neg\mathit{LikesMaths}(x)$.

   (c) $\forall x\, \mathit{LikesMaths}(x)$.

2. $\forall x\, (\mathit{Fur}(x) \wedge \mathit{Tail}(x))$, where Fur(x) = "x has fur" and Tail(x) =
   "x has a tail".

   Its negation is (c).

   (a) $\neg\exists x\, (\mathit{Fur}(x) \wedge \mathit{Tail}(x))$.

   (b) $\exists x\, (\neg\mathit{Fur}(x) \wedge \neg\mathit{Tail}(x))$.

   (c) $\exists x\, (\neg\mathit{Fur}(x) \vee \neg\mathit{Tail}(x))$.

3. $\forall x\, (\neg\mathit{Vaccinated}(x) \Rightarrow \mathit{Sick}(x))$, where Vaccinated(x) = "x has
   been vaccinated" and Sick(x) = "x got sick".

   Its negation is (c).

   (a) $\forall x\, (\mathit{Vaccinated}(x) \Rightarrow \neg\mathit{Sick}(x))$.

   (b) $\exists x\, (\mathit{Vaccinated}(x) \wedge \mathit{Sick}(x))$.

   (c) $\exists x\, (\neg\mathit{Vaccinated}(x) \wedge \neg\mathit{Sick}(x))$.
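
The answer to the third part can be checked by brute force. The sketch below assumes a hypothetical universe of three people and tries every way of assigning the predicates Vaccinated and Sick; a candidate is the negation only if it disagrees with the original statement in every case. A finite check like this illustrates the answer rather than proving it.

```python
from itertools import product

domain = range(3)   # a small, hypothetical universe of three people

def implies(p, q):
    return (not p) or q

option_c_always_negation = True
option_a_ever_differs = False
option_b_ever_differs = False

for vaccinated in product((False, True), repeat=len(domain)):
    for sick in product((False, True), repeat=len(domain)):
        # The statement: everyone who is not vaccinated got sick.
        statement = all(implies(not vaccinated[x], sick[x]) for x in domain)
        option_a = all(implies(vaccinated[x], not sick[x]) for x in domain)
        option_b = any(vaccinated[x] and sick[x] for x in domain)
        option_c = any(not vaccinated[x] and not sick[x] for x in domain)
        if option_c != (not statement):
            option_c_always_negation = False
        if option_a != (not statement):
            option_a_ever_differs = True
        if option_b != (not statement):
            option_b_ever_differs = True

print(option_c_always_negation, option_a_ever_differs, option_b_ever_differs)
```

Option (c) matches the negation in every interpretation tried, while (a) and (b) each disagree with it in at least one.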

Exercise  4.12  (page  125) 

Let Loves(x,y) = "x loves y", where the universe of discourse is the set of
people.

1. Everybody loves somebody: $\forall x \exists y\, \mathit{Loves}(x,y)$.

   Somebody is loved by everybody: $\exists x \forall y\, \mathit{Loves}(y,x)$.

These  English  statements  are  ambiguous,  as  each  may  be  interpreted 
as  saying  precisely  what  the  other  is  saying.  However,  the  likely  inter¬ 
pretation  for  each  is  as  formalised  in  predicate  logic  above. 

This  argument  is  not  valid.  For  example,  perhaps  Alice  only  loves 
herself,  but  everyone  else  loves  Bob  (including  Bob  himself);  in  this 
scenario,  the  premise  is  true,  but  the  conclusion  is  false. 

2. Somebody loves everybody: $\exists x \forall y\, \mathit{Loves}(x,y)$.

   Everybody is loved by somebody: $\forall x \exists y\, \mathit{Loves}(y,x)$.

This  argument  is  valid.  The  premise  of  the  argument  says  that  there 
is  some  person  -  Theresa  say  -  who  loves  everybody.  This  means  that 
the  conclusion  of  the  argument  must  be  true  as  well:  everybody  is 
loved  by  someone,  in  particular  by  Theresa. 
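
Both verdicts can be illustrated on a small model. The sketch below assumes a universe of three people (Carol is a hypothetical third person added to make the scenario concrete): it evaluates the counter-example for the first argument, and then searches every possible Loves relation on this universe for a counter-example to the second argument. Finding none on three people is consistent with validity but is not itself a proof.

```python
from itertools import product

people = ["Alice", "Bob", "Carol"]

# Counter-example for argument 1: Alice loves only herself,
# everyone else (including Bob) loves Bob.
loves = {("Alice", "Alice"), ("Bob", "Bob"), ("Carol", "Bob")}

everybody_loves_somebody = all(
    any((x, y) in loves for y in people) for x in people)
somebody_loved_by_everybody = any(
    all((y, x) in loves for y in people) for x in people)

# Argument 2: search all 2**9 relations for a case where the premise
# holds but the conclusion fails.
pairs = [(x, y) for x in people for y in people]
argument_2_counterexample_found = False
for bits in product((False, True), repeat=len(pairs)):
    rel = {pair for pair, keep in zip(pairs, bits) if keep}
    premise = any(all((x, y) in rel for y in people) for x in people)
    conclusion = all(any((y, x) in rel for y in people) for x in people)
    if premise and not conclusion:
        argument_2_counterexample_found = True

print(everybody_loves_somebody, somebody_loved_by_everybody)
print(argument_2_counterexample_found)
```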

Exercise 4.13 (page 127)

Exercise 5.2 (page 134)

(Fact 15.14)

$A \cup B \subseteq C \Rightarrow A \subseteq C \wedge B \subseteq C$.


Proof: Assume that $A \cup B \subseteq C$; we must show that $A \subseteq C \wedge B \subseteq C$.
This means that we must show both $A \subseteq C$ and $B \subseteq C$.

We consider $A \subseteq C$ first. By the definition of the set inclusion $A \subseteq C$,
we choose an arbitrary element $x \in A$ and we show that $x \in C$. Since $x \in A$,
it is thus also the case that $x \in A \cup B$. Hence, by our assumption, $x \in C$.
We have thus shown that $A \subseteq C$.

The proof that $B \subseteq C$ is very similar. □


Exercise  5.3  (page  136) 

Fact:  If  a  and  b  are  both  odd  integers,  then  ab  is  an  odd  integer. 

Proof:  Assume  that  a  and  b  are  odd  integers. 

An  odd  integer  is  one  more  than  twice  an  integer. 

Thus a = 2p+1 and b = 2q+1 for some integers p and q.

Hence ab = (2p+1)(2q+1) = 4pq + 2p + 2q + 1
         = 2(2pq + p + q) + 1
         = 2k+1 for the integer k = 2pq + p + q.

Therefore,  ab  is  an  odd  integer.  □ 


Exercise  5.5  (page  138) 

If  the  sum  of  the  digits  of  a  number  is  divisible  by  3,  then  that  number  itself 
is  divisible  by  3. 

•  The  sum  of  the  digits  of  45  is  4+5  =  9,  which  is  divisible  by  3;  so  by 
modus  ponens,  45  itself  is  divisible  by  3. 

•  The  sum  of  the  digits  of  9  839  853  is  9+8+3+9+8+5+3  =  45,  which  is 
divisible  by  3;  so  by  modus  ponens,  9  839  853  itself  is  divisible  by  3. 
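
The two instances above, and the rule itself on a range of small numbers, can be checked with a few lines of Python. The helper name `digit_sum` and the upper bound of 10 000 are arbitrary choices for this sketch.

```python
def digit_sum(n):
    """Sum of the decimal digits of the integer n."""
    return sum(int(d) for d in str(abs(n)))

for n in (45, 9839853):
    s = digit_sum(n)
    print(n, s, s % 3 == 0, n % 3 == 0)

# The implication "digit sum divisible by 3 => number divisible by 3",
# checked for every positive integer below 10 000.
rule_holds = all(n % 3 == 0
                 for n in range(1, 10000)
                 if digit_sum(n) % 3 == 0)
print(rule_holds)
```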

Exercise  5.9  (page  141) 

Fact:  There  is  no  smallest  positive  rational  number. 




Proof: Assume to the contrary that a > 0 is the smallest positive rational number.

Then  b  =  a/2  is  a  positive  rational  number  which  is  smaller  than  a, 
contradicting  our  assumption  that  a  is  the  smallest  such  number. 

Hence  there  cannot  be  a  smallest  positive  rational  number.  □ 


Exercise  5.10  (page  142) 

Fact:  Every  integer  greater  than  1  can  be  written  as  a  product  of  prime 
numbers. 

Proof: Assume to the contrary that not all integers greater than 1 can be
written as a product of prime numbers.

Let  n  be  the  smallest  such  integer;  thus,  every  smaller  integer  greater 
than  1  can  be  written  as  a  product  of  primes. 

By  assumption,  n  cannot  be  prime,  so  n  =  pq  where  p  and  q  are  two 
smaller  integers  greater  than  1. 

Since  p  and  q  are  smaller  than  n,  they  must  themselves  each  be  a 
product  of  primes. 

But  then  n  must  be  a  product  of  primes  as  well,  namely  the  product 
of  those  primes  making  up  p  and  q,  contradicting  the  definition  of  n. 

Hence  every  integer  greater  than  1  can  be  written  as  a  product  of  prime 
numbers.  □ 


Exercise  5.13  (page  145) 

Fact:  If  a  and  b  are  integers  and  ab  is  even,  then  either  a  is  even  or  b  is  even. 

Proof: Assume that a and b are integers and that ab is even. That is,
ab = 2p for some integer p.

Suppose that a is odd; that is, suppose that a = 2q+1 for some integer q.

Then ab = (2q+1)b = 2qb + b; and since ab = 2p, this means that
2p = 2qb + b, and thus that b = 2p − 2qb = 2(p − qb).

Since p − qb is an integer, this means that b must be even.


Thus, if a is not even, then b must be even; that is, either a or b is even.

Exercise 5.14 (page 146)

Fact: If $A \subseteq B$ then either $x \notin A$ or $x \in B$.

Proof: Assume that $A \subseteq B$.

Suppose that $x \in A$; that is, that it is not the case that $x \notin A$.

Then since $A \subseteq B$, we must have that $x \in B$.

Thus, either $x \notin A$, or $x \in B$.

Exercise  5.15  (page  147) 

Fact: For real numbers a and b, $|a + b| \leq |a| + |b|$.

Proof: Since $|a + b| = |b + a|$, we can assume without any loss of generality
that $|a| \geq |b|$.

• Either a and b have the same sign - that is, they are both nonnegative
(i.e., greater than or equal to 0) or they are both negative;

• or a and b have opposite signs - that is, one is nonnegative and the
other is negative.

We shall consider these two cases in turn.

• If a and b have the same sign, then $|a + b| = |a| + |b| \leq |a| + |b|$.

• If a and b have opposite signs, then $|a + b| = |a| - |b| \leq |a| + |b|$.

In  either  case,  the  result  is  true.  □ 

Exercise  5.16  (page  147) 

Fact: If n is an integer, then the final digit of $n^2$ is 0, 1, 4, 5, 6 or 9.

Proof: We can prove this by breaking down the problem into cases
depending on the final digit of n:

• If the final digit of n is 0, then the final digit of $n^2$ will be 0.

• If the final digit of n is 1 or 9, then the final digit of $n^2$ will be 1.

• If the final digit of n is 2 or 8, then the final digit of $n^2$ will be 4.

• If the final digit of n is 3 or 7, then the final digit of $n^2$ will be 9.

• If the final digit of n is 4 or 6, then the final digit of $n^2$ will be 6.

• If the final digit of n is 5, then the final digit of $n^2$ will be 5.


This  exhausts  all  possibilities  for  the  final  digit  of  n,  and  hence  the  result 
must  be  true.  □ 
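
The case analysis amounts to squaring each of the ten possible final digits, which the following Python sketch does directly. The variable names are illustrative.

```python
# Final digit of d*d for each possible final digit d of n.
last_digit_of_square = {d: (d * d) % 10 for d in range(10)}
possible = sorted(set(last_digit_of_square.values()))

print(last_digit_of_square)
print(possible)   # [0, 1, 4, 5, 6, 9]
```

The final digit of a square depends only on the final digit of n, which is why ten cases suffice.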
Exercise 5.17 (page 147)

Saying that it is not the case that $x \neq 7$ and $y \neq 8$ means that either x = 7
or y = 8, not that both of these equalities hold.


Exercise  5.18  (page  149) 

Fact: If A and $B \setminus C$ are disjoint then $A \cap B \subseteq C$.

Proof: Assume that A and $B \setminus C$ are disjoint. From this assumption, we
need to prove that $A \cap B \subseteq C$; that is, that for any x, if $x \in A \cap B$ then
$x \in C$:

$\forall x\, (x \in A \cap B \Rightarrow x \in C)$.

To this end, let a be an arbitrary value.

To show that $a \in A \cap B \Rightarrow a \in C$, we assume that $a \in A \cap B$ and
prove from this assumption that $a \in C$.

Assume then that $a \in A \cap B$; that is, that $a \in A$ and $a \in B$.

Since A and $B \setminus C$ are disjoint (from a premise of the proposition) and
$a \in A$, we must have that $a \notin B \setminus C$.

But since $a \in B$, $a \notin B \setminus C$ means we must have that $a \in C$. □


Exercise  5.19  (page  151) 

Fact: $\forall x > 0\ \exists y\ (y(y+1) = x)$.

Proof: Let $x > 0$ be arbitrary, and let $y = \frac{1}{2}\left(-1 + \sqrt{1+4x}\right)$.

Then
\[
\begin{aligned}
y(y+1) &= \tfrac{1}{2}\left(-1 + \sqrt{1+4x}\right)\left(\tfrac{1}{2}\left(-1 + \sqrt{1+4x}\right) + 1\right) \\
&= \tfrac{1}{4}\left(\sqrt{1+4x} - 1\right)\left(\sqrt{1+4x} + 1\right) \\
&= \tfrac{1}{4}\left((1+4x) - 1\right) = \tfrac{1}{4}(4x) = x \qquad \Box
\end{aligned}
\]

Where did this value of y come from? Given $x > 0$, we want a value y
satisfying $y(y+1) = x$, or in other words, by expanding and rewriting this
equation, a solution y to the quadratic equation

$y^2 + y - x = 0$.

The quadratic formula tells us that the two values for y which solve this
equation are

\[ y = \frac{-1 \pm \sqrt{1+4x}}{2}. \]

Only one of these two solutions is positive as required, namely
$y = \frac{1}{2}\left(-1 + \sqrt{1+4x}\right)$.
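
The witness can be checked numerically for a few sample values of x. This is only a floating-point sanity check of the algebra above, and the function name `witness` is an illustrative choice.

```python
import math

def witness(x):
    """The positive root of y*y + y - x = 0, for x > 0."""
    return (-1 + math.sqrt(1 + 4 * x)) / 2

for x in (0.5, 2, 6, 12, 100):
    y = witness(x)
    assert y > 0
    assert math.isclose(y * (y + 1), x)
    print(x, y)
```

For x = 2 and x = 6 the witness is a whole number (1 and 2 respectively), since 1 · 2 = 2 and 2 · 3 = 6.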

Exercise  5.20  (page  152) 

Fact: $\exists x\, (P(x) \vee Q(x)) \Leftrightarrow \exists x\, P(x) \vee \exists x\, Q(x)$.

Proof: ($\Rightarrow$) Suppose $\exists x\, (P(x) \vee Q(x))$.

Then $P(a) \vee Q(a)$ holds for some a.

For this value a, either P(a) holds or Q(a) holds.

- If P(a) holds, then $\exists x\, P(x)$, and thus $\exists x\, P(x) \vee \exists x\, Q(x)$.

- If Q(a) holds, then $\exists x\, Q(x)$, and thus $\exists x\, P(x) \vee \exists x\, Q(x)$.

In either case, $\exists x\, P(x) \vee \exists x\, Q(x)$.

($\Leftarrow$) Suppose $\exists x\, P(x) \vee \exists x\, Q(x)$.

Then either $\exists x\, P(x)$ holds, or $\exists x\, Q(x)$ holds.

- If $\exists x\, P(x)$ holds, then P(a) holds for some value a.

  For this a, $P(a) \vee Q(a)$ holds, and so $\exists x\, (P(x) \vee Q(x))$.

- If $\exists x\, Q(x)$ holds, then Q(a) holds for some value a.

  For this a, $P(a) \vee Q(a)$ holds, and so $\exists x\, (P(x) \vee Q(x))$.

Thus, in either case, $\exists x\, (P(x) \vee Q(x))$. □


Exercise  5.22  (page  153) 

Fact:  There  is  a  unique  set  A  such  that,  for  every  set  B,  A  U  B  =  B. 

Proof: To show existence of such a set, we simply note that the empty set
$\emptyset$ clearly has the desired property, as $\emptyset \cup B = B$ for every set B.

To show that $\emptyset$ is the only set with this property, assume that some
set A satisfies this property; in particular, taking $B = \emptyset$, this means that
$A \cup \emptyset = \emptyset$. But then $A = A \cup \emptyset = \emptyset$. □


Chapter  6 

Exercise  6.2  (page  158) 

1. range(score) = {46, 54, 59, 64, 68, 75, 78, 88, 92, 100}.

2. $\mathit{score}^{-1}(\{n \in \mathbb{N} : n > 70\})$.

Exercise 6.3 (page 158)

1.  Mother  is  a  function  as  every  person  has  one  and  exactly  one  (biolog¬ 
ical)  mother. 

2.  Parent  is  not  a  function  as  people  have  two  parents  not  one. 

3.  Child  is  not  a  function  as  a  person  may  have  any  number  of  children. 

4.  FirstBornChild  is  not  a  function  as  a  person  may  have  no  children. 

Exercise  6.4  (page  160) 

graph(f) = { (1, c), (2, a), (3, c) }.

Exercise  6.5  (page  161) 

1. The function score is not one-to-one as, for example, score(Collins) =
score(Parker). Also, score(Evans) = score(Williams).

2. The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x^2$ is not one-to-one as, for
example, $f(-1) = f(1)$. (In fact, $f(x) = f(-x)$ for any value $x \in \mathbb{R}$.)

3. The function $f : \mathbb{N} \to \mathbb{N}$ defined by $f(x) = x^2$ is one-to-one.

Exercise 6.6 (page 161)

1. The function score is not onto as, for example, no one has scored 0.

2. The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x^2$ is not onto as $f(x) \geq 0$
for all $x \in \mathbb{R}$ so, for example, for no $x \in \mathbb{R}$ do we have $x^2 = -1$.

3. The function $f : \mathbb{N} \to \mathbb{N}$ defined by $f(x) = x^2$ is not onto as, for
example, for no $x \in \mathbb{N}$ do we have $x^2 = 3$.
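
The behaviour of the squaring function can be illustrated on finite samples of its domain. The sketch below is only an illustration: the ranges are finite stand-ins for the infinite sets (and only integer points are sampled), and the helper `is_one_to_one` is a hypothetical name.

```python
def is_one_to_one(f, domain):
    """True if no two distinct elements of `domain` share an image under f."""
    seen = {}
    for x in domain:
        y = f(x)
        if y in seen and seen[y] != x:
            return False
        seen[y] = x
    return True

def square(x):
    return x * x

integers = range(-50, 51)   # finite sample including negatives
naturals = range(0, 51)     # finite sample of the natural numbers

print(is_one_to_one(square, integers))       # False: square(-1) == square(1)
print(is_one_to_one(square, naturals))       # True on this sample
print(3 in {square(n) for n in naturals})    # False: 3 is not a square
```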

Exercise  6.7  (page  161) 

• $f_1$ is one-to-one but not onto, as there is an element of the codomain
(the third element from the top) which is not in the range of the
function.

• $f_2$ is onto but not one-to-one, as $f_2$ maps two elements of the domain
(the top and bottom elements) to the same element of the codomain
(the middle of the three elements).

• $f_3$ is not one-to-one, as it maps two elements of the domain (the first
two elements) to the same element of the codomain (the third element
from the top); nor is $f_3$ onto, as there is an element of the codomain
(the second element from the top) which is not in the range of the
function.

• $f_4$ is both one-to-one and onto.

Exercise 6.8 (page 163)

abcdefghijklm 

TTTTTiTTTTTTT 

xiepdtovzsbrnj 
n  o  p  q  r  s  t  u  v  w  x  y  z 

TTTTTTTTTTTTT 

wrqnuf  chi  9  V  a  k 

Exercise  6.9  (page  164) 


f°g  g°f 


Exercise  6.10  (page  165) 

If $f : A \to B$ and $g : B \to C$ are both bijections, then they are both
one-to-one and onto. Therefore, $g \circ f : A \to C$ is both one-to-one (by Theorem 6.9)
and onto (by Theorem 6.10), and thus it is a bijection.

Exercise  6.11  (page  165) 

Let $a \in A$ be arbitrary. By the definition of the inverse of a bijection,
Definition 6.7 (page 162), if $f^{-1}(f(a)) = x$ then $f(x) = f(a)$. Since f is
one-to-one, this means that x = a. Hence $f^{-1}(f(a)) = a$ for any $a \in A$; that
is, $f^{-1} \circ f = \mathrm{id}_A$.

Let $b \in B$ be arbitrary. Again by Definition 6.7, if $f(f^{-1}(b)) = y$ then
y = b. Hence $f(f^{-1}(b)) = b$ for any $b \in B$; that is, $f \circ f^{-1} = \mathrm{id}_B$.

Exercise  6.12  (page  166) 

$(h \circ (g \circ f))(x) = h((g \circ f)(x)) = h(g(f(x)))$

$= (h \circ g)(f(x)) = ((h \circ g) \circ f)(x)$.

Exercise  6.14  (page  170) 

\[
f^{-1}(n) = \begin{cases} 2n - 1, & \text{if } n > 0; \\ -2n, & \text{if } n \leq 0. \end{cases}
\]
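
A quick check of this piecewise definition on a finite window: the sketch below assumes the second case covers n = 0 (so that 0 is mapped to 0), and confirms that the integers from −10 to 10 are sent to the natural numbers 0 to 20 without repetition. The name `f_inverse` is an illustrative choice.

```python
def f_inverse(n):
    """Map an integer to a natural number: positives to odds, the rest to evens."""
    return 2 * n - 1 if n > 0 else -2 * n

integers = range(-10, 11)
images = [f_inverse(n) for n in integers]

print(sorted(images))
```

Positive integers land on the odd numbers 1, 3, 5, ... and the non-positive integers on the even numbers 0, 2, 4, ..., so no natural number is hit twice.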
Exercise 6.15 (page 170)

Take $h = g \circ f^{-1}$ which, by Exercise 6.10, is guaranteed to be a bijection.

Exercise  6.16  (page  172) 

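Given the bijection $f : \mathbb{N} \to \mathbb{Q}^+$ from Example 6.15, the function $g : \mathbb{N} \to \mathbb{Q}$
defined by

\[
g(n) = \begin{cases}
0, & \text{if } n = 0, \\
f\left(\frac{n-1}{2}\right), & \text{if } n > 0 \text{ is odd,} \\
-f\left(\frac{n-2}{2}\right), & \text{if } n > 0 \text{ is even,}
\end{cases}
\]

is a bijection.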

Exercise  6.17  (page  173) 

Consider any element $a \in A$.

• If $a \in B$ then by definition of B, $a \notin f(a)$, so $B \neq f(a)$.

• If $a \notin B$ then by definition of B, $a \in f(a)$, so again $B \neq f(a)$.

We thus have that $B \neq f(a)$ for every $a \in A$, that is, f cannot be onto.

Exercise  6.18  (page  174) 

Let $S = \{1, 2\}$, and let $f : \mathcal{P}(S) \to \mathcal{P}(S)$ be defined by:

$f(\emptyset) = f(\{1\}) = \{1\}$ and $f(\{2\}) = f(S) = \{2\}$.

The subsets {1} and {2} are clearly fixed points of f, and are the only fixed
points of f. As $\{1\} \not\subseteq \{2\}$ and $\{2\} \not\subseteq \{1\}$, these are neither greatest nor
least fixed points.
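
Since the power set of a two-element set has only four members, the claim can be verified by enumeration. In the sketch below the function f is stored as a dictionary on frozensets; this encoding and the variable names are illustrative.

```python
S = frozenset({1, 2})
EMPTY = frozenset()
ONE, TWO = frozenset({1}), frozenset({2})

# f(∅) = f({1}) = {1} and f({2}) = f(S) = {2}
f = {EMPTY: ONE, ONE: ONE, TWO: TWO, S: TWO}

fixed_points = [X for X in f if f[X] == X]

# A least fixed point is contained in every fixed point;
# a greatest fixed point contains every fixed point.
least = [X for X in fixed_points if all(X <= Y for Y in fixed_points)]
greatest = [X for X in fixed_points if all(Y <= X for Y in fixed_points)]

print(fixed_points, least, greatest)
```

Both `least` and `greatest` come out empty, because the two fixed points are incomparable under inclusion.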

Exercise  6.20  (page  176) 


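1. If $S \subseteq T$, then

   $f(S) = \{0\} \cup \{n+2 : n \in S\} \subseteq \{0\} \cup \{n+2 : n \in T\} = f(T)$.

2. $f(\emptyset) = \{0\}$                              $f(\mathbb{N}) = \mathbb{N} \setminus \{1\}$
   $f^2(\emptyset) = \{0, 2\}$                         $f^2(\mathbb{N}) = \mathbb{N} \setminus \{1, 3\}$
   $f^3(\emptyset) = \{0, 2, 4\}$                      $f^3(\mathbb{N}) = \mathbb{N} \setminus \{1, 3, 5\}$
   $f^n(\emptyset) = \{0, 2, \ldots, 2n-2\}$           $f^n(\mathbb{N}) = \mathbb{N} \setminus \{1, 3, \ldots, 2n-1\}$

3. $L = G = \{0, 2, 4, 6, \ldots\}$.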
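
The approximations from below can be generated by iterating f starting from the empty set. The sketch uses the definition of f from part 1; the helper name `iterate` is an illustrative choice, and only the finite iterates starting from the empty set are computed, since the iterates starting from all of the natural numbers are infinite sets.

```python
def f(S):
    """f(S) = {0} ∪ {n + 2 : n ∈ S}."""
    return frozenset({0}) | frozenset(n + 2 for n in S)

def iterate(S, n):
    """Apply f to S a total of n times."""
    for _ in range(n):
        S = f(S)
    return S

approximations = [sorted(iterate(frozenset(), n)) for n in range(1, 5)]
for n, approx in enumerate(approximations, start=1):
    print(n, approx)
```

Each iterate adds the next even number, which matches the pattern {0, 2, ..., 2n − 2} and suggests why the least fixed point is the set of even numbers.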

Chapter  7 

Exercise  7.1  (page  180) 

1. Q = { r ∈ Bond Films : r was directed by Lewis Gilbert }

     = { r03, r06, r07 }.

2. Q = { r ∈ Bond Films : r was released in the 1970s }

     = { r05, r06, r07 }.

Exercise  7.3  (page  183) 

StarsIn = { (Sean Connery, Dr. No),

(Sean  Connery,  Thunderball), 

(Sean  Connery,  You  Only  Live  Twice), 

(George  Lazenby,  On  Her  Majesty’s  Secret  Service), 
(Sean  Connery,  Diamonds  Are  Forever), 

(Roger  Moore,  The  Spy  Who  Loved  Me), 

(Roger  Moore,  Moonraker), 

(Roger  Moore,  For  Your  Eyes  Only), 

(Sean  Connery,  Never  Say  Never  Again), 

(Roger  Moore,  Octopussy), 

(Roger  Moore,  A  View  to  a  Kill), 

(Timothy  Dalton,  The  Living  Daylights), 

(Timothy  Dalton,  Licence  to  Kill), 

(Pierce Brosnan, GoldenEye),

(Pierce  Brosnan,  Tomorrow  Never  Dies), 

(Pierce  Brosnan,  The  World  Is  Not  Enough), 

(Pierce  Brosnan,  Die  Another  Day), 

(Daniel  Craig,  Casino  Royale), 

(Daniel  Craig,  Quantum  of  Solace), 

(Daniel  Craig,  Skyfall)  }. 

Exercise  7.5  (page  184) 

Letting SC, GL, RM, TD, PB and DC stand for Sean Connery, George Lazenby,
Roger Moore, Timothy Dalton, Pierce Brosnan and Daniel Craig, respectively,
the binary relation Before consists of the following pairs:

Before = { (SC, SC), (SC, GL), (SC, RM), (SC, TD),
           (SC, PB), (SC, DC), (GL, SC), (GL, RM),
           (GL, TD), (GL, PB), (GL, DC), (RM, SC),
           (RM, RM), (RM, TD), (RM, PB), (RM, DC),
           (TD, TD), (TD, PB), (TD, DC), (PB, PB),
           (PB, DC), (DC, DC) }.

This  binary  relation  can  be  visualised  as  follows: 


o 


The  binary  relation  FirstBefore  consists  of  the  following  pairs: 

FirstBefore = { (SC, GL), (SC, RM), (SC, TD),
(SC,  PB),  (SC,  DC),  (GL,  RM), 
(GL,  TD),  (GL,  PB),  (GL,  DC), 
(RM,  TD),  (RM,  PB),  (RM,  DC), 
(TD,  PB),  (TD,  DC),  (PB,  DC)  }. 


This  binary  relation  can  be  visualised  as  follows: 


Exercise  7.6  (page  185) 

Child = { (Donald, Quackmore), (Donald, Hortense),
          (Della, Quackmore), (Della, Hortense),
          (Huey, Della), (Louis, Della), (Dewey, Della) }.

Brother = { (Scrooge, Hortense), (Donald, Della),
            (Huey, Louis), (Huey, Dewey), (Louis, Huey),
            (Louis, Dewey), (Dewey, Huey), (Dewey, Louis) }.

Sister = { (Hortense, Scrooge), (Della, Donald) }.

Sibling = { (Scrooge, Hortense), (Hortense, Scrooge),
            (Donald, Della), (Della, Donald),
            (Huey, Louis), (Louis, Huey),
            (Huey, Dewey), (Dewey, Huey),
            (Louis, Dewey), (Dewey, Louis) }.

The  Child  relation  can  be  visualised  as  follows. 

[Diagram: Quackmore, Hortense and Scrooge on the top row; Donald and Della below them; Huey, Louis and Dewey on the bottom row.]


Exercise  7.7  (page  187) 

1.  Rj  U  R,  =  R3. 

2 .  /?3  n  /?2  —  R\  • 

3.  R3  \  Ri  =  R3. 

Exercise  7.8  (page  188) 

$\mathit{Sibling}^{-1} = \mathit{Sibling}$.

Exercise  7.9  (page  189) 

• $\mathit{Uncle} = \mathit{Parent} \circ \mathit{Brother}$ (an uncle is a brother of a parent).

In  the  case  of  the  Duck  family,  we  have: 

Uncle  =  {  (Scrooge,  Donald),  (Scrooge,  Della), 

(Donald,  Huey),  (Donald,  Louis),  (Donald,  Dewey)  }. 

The  first  two  pairs  arise  from  the  fact  that  Scrooge  is  a  brother  of 
Hortense,  who  is  a  parent  of  Donald  and  Della. 

The final three pairs arise from the fact that Donald is a brother of
Della, who is a parent of Huey, Louis and Dewey.

• $\mathit{Nephew} = \mathit{Sibling} \circ \mathit{Son}$ (a nephew is a son of a sibling).

In  the  case  of  the  Duck  family,  we  have: 

Nephew  =  {  (Donald,  Scrooge),  (Huey,  Donald), 

(Louis,  Donald),  (Dewey,  Donald)  }. 

The  first  pair  arises  from  the  fact  that  Donald  is  a  son  of  Hortense, 
who  is  a  sibling  of  Scrooge. 

The  final  three  pairs  arise  from  the  fact  that  Huey,  Louis  and  Dewey 
are  sons  of  Della,  who  is  a  sibling  of  Donald. 
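
The Uncle relation can be computed from the Child and Brother relations of Exercise 7.6. In the sketch below, relations are Python sets of pairs, Parent is taken to be the inverse of Child, and the helper `then` builds the pairs (x, z) linked through some middle element; this helper name is chosen to sidestep the question of which order the ∘ notation composes in, and is not notation from the text.

```python
child = {
    ("Donald", "Quackmore"), ("Donald", "Hortense"),
    ("Della", "Quackmore"), ("Della", "Hortense"),
    ("Huey", "Della"), ("Louis", "Della"), ("Dewey", "Della"),
}
brother = {
    ("Scrooge", "Hortense"), ("Donald", "Della"),
    ("Huey", "Louis"), ("Huey", "Dewey"), ("Louis", "Huey"),
    ("Louis", "Dewey"), ("Dewey", "Huey"), ("Dewey", "Louis"),
}

def inverse(rel):
    """Swap the components of every pair."""
    return {(y, x) for (x, y) in rel}

def then(first, second):
    """Pairs (x, z) with (x, y) in `first` and (y, z) in `second` for some y."""
    return {(x, z) for (x, y1) in first for (y2, z) in second if y1 == y2}

parent = inverse(child)          # (p, c): p is a parent of c
uncle = then(brother, parent)    # a brother of a parent

for pair in sorted(uncle):
    print(pair)
```

The five pairs produced are exactly those listed for the Duck family.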

Exercise  7.10  (page  190) 

This  follows  easily  from  property  (*)  of  Theorem  7.6. 

Exercise  7.11  (page  191) 

The  relation  Before  is  not  reflexive,  as  George  Lazenby  is  not  related  to 
himself  by  this  relation.  (Having  starred  in  only  one  film,  he  could  not  have 
appeared  in  one  film  before  starring  in  another  film.) 

The  relation  Before  is  also  not  irreflexive,  as  all  of  the  other  actors  who 
have  played  James  Bond  have  done  so  on  more  than  one  occasion,  so  each 
of  them  is  related  to  himself  by  the  Before  relation. 

The  relation  FirstBefore  is  irreflexive  (and  thus  it  is  not  reflexive),  as  an 
actor  could  not  have  starred  as  James  Bond  before  starring  as  James  Bond. 

Exercise  7.12  (page  191) 

The relation Before is not symmetric; for example, it contains the pair
(SC,TD) but not the pair (TD,SC). Nor is it antisymmetric; for example, it
contains the pairs (SC,GL) and (GL,SC), and SC ≠ GL.

The  relation  FirstBefore  is  not  symmetric;  for  example,  it  contains  the 
pair  (SC,GL)  but  not  the  pair  (GL,SC).  However,  it  is  antisymmetric:  given 
two  James  Bond  actors,  one  of  the  two  will  not  have  starred  as  James  Bond 
before  the  other. 

Exercise  7.13  (page  192) 

The  relation  Before  is  not  transitive;  for  example,  it  contains  the  pairs 
(RM,SC)  and  (SC,GL),  but  not  the  pair  (RM,GL). 

The relation FirstBefore is transitive: if one actor starred as James Bond
before a second actor, who in turn starred as James Bond before a third
actor, then the first actor will naturally have starred as James Bond before
the third actor.
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Exercise  7.14  (page  192) 

The  is- an- ancestor- of  relation  is 

•  not  reflexive,  but  in  fact  irreflexive,  as  a  person  cannot  be  their  own 
ancestor; 

•  not  symmetric,  but  in  fact  antisymmetric,  as  a  person  cannot  be  an 
ancestor  of  their  own  ancestor;  and 

•  transitive,  as  an  ancestor  of  an  ancestor  is  again  an  ancestor. 

The  is-married-to  relation  is 

•  not  reflexive,  but  in  fact  irreflexive,  as  a  person  cannot  be  married  to 
themselves; 

•  symmetric,  and  not  antisymmetric,  as  the  person  you  are  married  to 
is  of  course  married  to  you;  and 

•  not  transitive,  as  otherwise  a  married  person,  by  symmetry,  would 
then  have  to  be  married  to  themselves. 

Exercise  7.18  (page  194) 

1.  This  is  a  partial  order  but  not  a  total  order;  and  it  is  an  equivalence 
relation. 

2.  This  is  not  a  partial  order  (it  is  not  antisymmetric),  and  hence  not  a 
total  order;  but  it  is  an  equivalence  relation. 

3.  This  is  not  a  partial  order  (it  is  not  antisymmetric),  and  hence  not  a 
total  order;  but  it  is  an  equivalence  relation. 

Exercise  7.19  (page  194) 

R1 is an equivalence relation, as it is clearly reflexive (a student takes all
the  same  courses  as  themselves),  symmetric  (if  x  takes  all  the  same  courses 
as  y  then  y  takes  all  the  same  courses  as  x)  and  transitive  (if  x  takes  all  the 
same  courses  as  y  and  y  takes  all  the  same  courses  as  z  then  x  takes  all  the 
same  courses  as  z). 

R2  is  not  an  equivalence  relation,  as  it  is  not  transitive  (though  it  is 
reflexive  and  symmetric).  For  example,  Alice  and  Bob  might  take  the  same 
Mathematics  course,  and  Bob  and  Carol  might  take  the  same  Computing 
course,  while  Alice  and  Carol  do  not  take  any  of  the  same  courses. 

Exercise  7.21  (page  195) 

The finest partition of a set A consists of singletons: { {a} : a ∈ A }.

The coarsest partition of a set A consists of one set: {A}.

Exercise 7.23 (page 196)

The equivalence relation defined by the finest partition of a set A is the
identity relation: I_A = { (a, a) : a ∈ A }.

The equivalence relation defined by the coarsest partition of a set A is the
universal relation: U_A = { (a, b) : a, b ∈ A }.

Exercise  7.24  (page  196) 

The relation R partitions the set A into the following 18 equivalence classes:

[1] = {1}
[2] = {2, 4, 8, 16}
[3] = {3, 9, 27}
[5] = {5, 25}
[6] = {6, 12, 18, 24}
[7] = {7}
[10] = {10, 20}
[11] = {11}
[13] = {13}
[14] = {14, 28}
[15] = {15}
[17] = {17}
[19] = {19}
[21] = {21}
[22] = {22}
[23] = {23}
[26] = {26}
[29] = {29}

Chapter  8 

Exercise  8.1  (page  203) 

4 ∈ N: By clause (1), 0 ∈ N, so by clause (2), 1 ∈ N; so by clause (2), 2 ∈ N;
so by clause (2), 3 ∈ N; and so finally by clause (2), 4 ∈ N.

4.5 ∉ N: Since 4.5 ≠ 0, clause (1) does not apply, so we could only infer that
4.5 ∈ N from clause (2), and thus from first inferring that 3.5 ∈ N;
but by a similar reasoning we could only infer this by first inferring
that 2.5 ∈ N; which we could only infer by first inferring that 1.5 ∈ N;
which we could only infer by first inferring that 0.5 ∈ N; which we
could only infer by first inferring that −0.5 ∈ N; which we could only
infer by first inferring that −1.5 ∈ N; et cetera ad infinitum. This
process would never “bottom out”, so we could never infer that any
of these were in N.

Alternatively, we can easily see that the set { 0, 1, 2, 3, 4, ... } satisfies
clauses (1) and (2) of the definition; and since N is being defined to be
the smallest set satisfying these clauses, N must be a subset of this;
since this set does not contain 4.5, 4.5 ∉ N.

Exercise 8.2 (page 204)

Odd is defined to be the smallest set satisfying the two clauses. The fact
that N satisfies these two clauses only implies that Odd ⊆ N; that is, N is
not necessarily (and in fact is not) the smallest such set.

Exercise  8.3  (page  204) 

Powers-of-2  is  the  smallest  set  satisfying  the  following: 

1. 1 ∈ Powers-of-2.

2. If n ∈ Powers-of-2 then 2n ∈ Powers-of-2.

Exercise  8.4  (page  205) 

Given a set A, the smallest set P(A) satisfying:

1. ∅ ∈ P(A); and

2. if X ∈ P(A) and a ∈ A then X ∪ {a} ∈ P(A)

is the set of all finite subsets of A. This is the same as the powerset 𝒫(A)
of A only in the case when A is a finite set.

Exercise  8.5  (page  207) 

PosDecimalNumbers  is  inductively  defined  as  the  smallest  set  satisfying 
the  following: 

1. 1, 2, 3, 4, 5, 6, 7, 8, 9 ∈ PosDecimalNumbers;

2. If w ∈ PosDecimalNumbers and x ∈ DecimalDigits
then wx ∈ PosDecimalNumbers.

Exercise  8.6  (page  208) 

The  following  is  a  BNF  equation  for  formulae  of  predicate  logic. 

p, q ::= true | false | P(x1, ..., xn)
       | ¬p | p ∨ q | p ∧ q | p ⇒ q | p ⇔ q | ∀x p | ∃x p

Here, P(x1, ..., xn) is taken to range over the set of predicates with free
variables taken from x1, ..., xn and x is taken to range over all variables.

Exercise  8.7  (page  212) 

The dictionary data structure can be defined using the following BNF
equation:

d ::= ⊥ | N(w, d1, d2)

where w ranges over words (representing names). That is, a dictionary is
either a leaf (if it is empty), or it consists of a name along with two
sub-dictionaries d1 and d2. (Note that the semantic understanding of a
dictionary, i.e., the property that the stored names are ordered
lexicographically throughout the dictionary, is not reflected in this data
structure definition, only its syntactic structure.)

Exercise  8.8  (page  213) 

• $s_1 = s_0 + 2 \cdot 1 - 1 = 0 + 2 - 1 = 1$

• $s_2 = s_1 + 2 \cdot 2 - 1 = 1 + 4 - 1 = 4$

• $s_3 = s_2 + 2 \cdot 3 - 1 = 4 + 6 - 1 = 9$

• $s_4 = s_3 + 2 \cdot 4 - 1 = 9 + 8 - 1 = 16$

• $s_5 = s_4 + 2 \cdot 5 - 1 = 16 + 10 - 1 = 25$

• $s_6 = s_5 + 2 \cdot 6 - 1 = 25 + 12 - 1 = 36$

It would appear (though it is as yet uncertain) that $s_n = n^2$.


Exercise  8.9  (page  213) 


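We could readily compute $H_6$ directly from the sum $1 + \frac{1}{2} + \cdots + \frac{1}{6}$.

However, by the inductive definition we would proceed as follows:

• $H_1 = H_0 + \frac{1}{1} = 0 + 1 = 1$

• $H_2 = H_1 + \frac{1}{2} = 1 + \frac{1}{2} = \frac{3}{2} = 1.5$

• $H_3 = H_2 + \frac{1}{3} = \frac{3}{2} + \frac{1}{3} = \frac{11}{6} \approx 1.833$

• $H_4 = H_3 + \frac{1}{4} = \frac{11}{6} + \frac{1}{4} = \frac{25}{12} \approx 2.083$

• $H_5 = H_4 + \frac{1}{5} = \frac{25}{12} + \frac{1}{5} = \frac{137}{60} \approx 2.283$

• $H_6 = H_5 + \frac{1}{6} = \frac{137}{60} + \frac{1}{6} = \frac{49}{20} = 2.45$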
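The inductive computation of the harmonic numbers can be replayed with exact rational arithmetic. The following is a minimal sketch in Python; the function name `harmonic` is chosen here for illustration, and it assumes the inductive definition H_0 = 0, H_n = H_{n-1} + 1/n.

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """Return H_n, built up step by step from H_0 = 0 via H_k = H_{k-1} + 1/k."""
    h = Fraction(0)
    for k in range(1, n + 1):
        h += Fraction(1, k)  # exact rational addition, no rounding
    return h


if __name__ == "__main__":
    for n in range(1, 7):
        value = harmonic(n)
        print(f"H_{n} = {value} = {float(value):.3f}")
```

Using `Fraction` rather than floating point keeps each intermediate value as a reduced fraction, so the results can be compared directly with hand calculation.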

Exercise  8.10  (page  214) 

At the start of month n you will have $f_n$ pairs of rabbits, where $f_n$ is the
nth Fibonacci number.

• For a start, at the start of month 1 you have 1 pair, and at the start
of month 2 you still have just the 1 pair. At the start of month 3,
though, you will have 2 pairs, and at the start of month 4 you will
have 3 pairs.

• In general, at the start of month n you will have $f_n = f_{n-1} + f_{n-2}$
pairs of rabbits, as you will have as many pairs as you had at the start
of month n−1, namely $f_{n-1}$, plus a new pair for each pair you had at
the start of month n−2, namely $f_{n-2}$.

Exercise  8.11  (page  215) 

mult(m,  0)  =  0;  and 
mult(m,  s(n))  =  add(mult(m,  n),  m). 
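As a rough illustration, the two clauses can be transcribed into Python, modelling the successor s(n) as n + 1. The definition of `add` used below (add(m, 0) = m and add(m, s(n)) = s(add(m, n))) is an assumption, since it is not restated in this solution.

```python
def add(m: int, n: int) -> int:
    """Assumed definition: add(m, 0) = m; add(m, s(n)) = s(add(m, n))."""
    if n == 0:
        return m
    return add(m, n - 1) + 1  # the "+ 1" plays the role of the successor s


def mult(m: int, n: int) -> int:
    """mult(m, 0) = 0; mult(m, s(n)) = add(mult(m, n), m)."""
    if n == 0:
        return 0
    return add(mult(m, n - 1), m)


if __name__ == "__main__":
    print(mult(3, 4))
```

Both functions recurse on their second argument only, mirroring the structural recursion over the inductively defined natural numbers.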

Exercise  8.12  (page  215) 

sum([ ]) = 0
sum(n : L) = n + sum(L)

Thus for example,

sum([6, 2, 5]) = 6 + sum([2, 5])
               = 6 + 2 + sum([5])
               = 6 + 2 + 5 + sum([ ])
               = 6 + 2 + 5 + 0
               = 13.

Exercise  8.13  (page  215) 

[ ] ++ L2 = L2

(h : L) ++ L2 = h : (L ++ L2).

Exercise  8.14  (page  216) 

fv(true) = fv(false) = ∅
fv(P(x1, ..., xn)) = {x1, ..., xn}
fv(¬p) = fv(p)

fv(p ∨ q) = fv(p ∧ q) = fv(p ⇒ q) = fv(p ⇔ q) = fv(p) ∪ fv(q)
fv(∀x p) = fv(∃x p) = fv(p) \ {x}
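The recursive definition of free variables can be sketched as a function over a small formula representation. The tuple encoding used below (tags such as "pred", "not", "forall") is purely an assumption made for this illustration and is not part of the text.

```python
def fv(p: tuple) -> frozenset:
    """Free variables of a formula encoded as a nested tuple (hypothetical encoding).

    ("true",) / ("false",)           constants
    ("pred", name, (x1, ..., xn))    atomic predicate P(x1, ..., xn)
    ("not", p)                       negation
    (op, p, q)                       op in {"or", "and", "implies", "iff"}
    (quant, x, p)                    quant in {"forall", "exists"}
    """
    tag = p[0]
    if tag in ("true", "false"):
        return frozenset()
    if tag == "pred":
        return frozenset(p[2])
    if tag == "not":
        return fv(p[1])
    if tag in ("or", "and", "implies", "iff"):
        return fv(p[1]) | fv(p[2])
    if tag in ("forall", "exists"):
        return fv(p[2]) - {p[1]}  # the quantifier binds its variable
    raise ValueError(f"unknown formula tag: {tag!r}")


if __name__ == "__main__":
    # forall x. (P(x, y) => exists z. Q(z, x))
    formula = ("forall", "x",
               ("implies",
                ("pred", "P", ("x", "y")),
                ("exists", "z", ("pred", "Q", ("z", "x")))))
    print(sorted(fv(formula)))
```

Each branch of the function corresponds to one clause of the definition, which is what makes a proof by structural induction over formulae line up with the code.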


Exercise  8.15  (page  216) 

By definition, f(n) = n − 10 for each n > 100. Thus we need only consider
the value of f(n) for each n from 0 to 100 and verify that f(n) = 91 in each
case. We can do this starting from n = 100 and working down, using the
values we calculate along the way.

• f(100) = f(f(111)) = f(101) = 91.

• f(99) = f(f(110)) = f(100) = 91.

• f(98) = f(f(109)) = f(99) = 91.

  ⋮

• f(91) = f(f(102)) = f(92) = 91.

• f(90) = f(f(101)) = f(91) = 91.

• f(89) = f(f(100)) = f(91) = 91.

• f(88) = f(f(99)) = f(91) = 91.

  ⋮

• f(1) = f(f(12)) = f(91) = 91.

• f(0) = f(f(11)) = f(91) = 91.
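The claim can also be checked mechanically. The sketch below assumes the definition implied by the calculations above, namely f(n) = n − 10 for n > 100 and f(n) = f(f(n + 11)) otherwise.

```python
def f(n: int) -> int:
    """The function from the exercise: n - 10 above 100, else f(f(n + 11))."""
    if n > 100:
        return n - 10
    return f(f(n + 11))


if __name__ == "__main__":
    # Every argument from 0 to 100 should evaluate to 91.
    print(all(f(n) == 91 for n in range(101)))
```

Such a check covers only the finitely many cases tested; it complements the hand calculation rather than replacing it.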


Exercise  8.16  (page  218) 

(insert a) [ ] = [a]

(insert a) (b : L) = if a<b then a : (b : L) else b : ((insert a) L)
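For comparison, here is one possible transcription of the two clauses into Python, using list slicing in place of the h : L pattern. The uncurried signature `insert(a, lst)` is a choice made for this sketch.

```python
def insert(a, lst: list) -> list:
    """Insert a into the ordered list lst, following the two clauses above."""
    if not lst:                      # (insert a) [ ] = [a]
        return [a]
    b, rest = lst[0], lst[1:]        # lst has the form b : L
    if a < b:
        return [a] + lst             # a : (b : L)
    return [b] + insert(a, rest)     # b : ((insert a) L)


if __name__ == "__main__":
    print(insert(4, [1, 3, 5, 7]))
```

The function returns a new list rather than modifying its argument, which matches the functional reading of the definition.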

Exercise  8.17  (page  220) 

Moving  a  pyramid  of  n  discs  can  be  done  as  follows. 

1. If n = 1 then simply move the single disc to the new peg. Otherwise do
the following.

2. Move the pyramid of n−1 discs sitting on top of the largest disc to a
different peg.

3. Move the largest disc to the other empty peg.

4. Move the pyramid of n−1 discs onto the peg holding the largest disc.

Note  the  two  recursive  calls  in  steps  2  and  4. 

Carried  out  on  a  tower  of  five  discs,  this  would  require  31  individual 
moves. 
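The procedure, and the count of 31 moves for five discs, can be reproduced with a short recursive sketch. The peg names "A", "B", "C" and the function name `hanoi` are illustrative choices, not taken from the text.

```python
def hanoi(n: int, source: str, target: str, spare: str) -> list:
    """Return the list of (from_peg, to_peg) moves for a pyramid of n discs."""
    if n == 1:
        return [(source, target)]
    moves = hanoi(n - 1, source, spare, target)    # clear the n-1 smaller discs
    moves.append((source, target))                 # move the largest disc
    moves += hanoi(n - 1, spare, target, source)   # put the n-1 discs back on top
    return moves


if __name__ == "__main__":
    print(len(hanoi(5, "A", "C", "B")))
```

The two recursive calls each handle n−1 discs, so the number of moves M(n) satisfies M(1) = 1 and M(n) = 2M(n−1) + 1, giving 2^n − 1 moves in total.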


Chapter  9 

Exercise  9.1  (page  226) 

It would appear that the number of regions doubles every time a new spot is
added, so it is tempting to guess that 32 regions will be created by connecting
6 spots. In general, our intuition is suggesting that $2^{n-1}$ regions are created
by connecting n spots, based on the evidence with n = 1, 2, 3, 4 and 5.

Unfortunately for our intuition, this guess is wrong: no matter how hard
you try, you can only create 31 regions by connecting 6 spots.

In fact, the formula for the number of regions created by connecting n
spots is not $2^{n-1}$, but the following rather astonishing formula:

$\frac{n^4 - 6n^3 + 23n^2 - 18n + 24}{24}$.

Where  does  this  formula  come  from?  Starting  with  no  lines,  the  circle 
has  just  one  region.  Each  time  you  draw  a  new  line  across  the  circle,  you 
increase  the  number  of  regions  by  1  more  than  the  number  of  existing  lines 
which  this  new  line  crosses.  Thus  the  number  of  regions  is  one  more  than 
the  total  number  of  lines  added  to  the  total  number  of  intersections. 

The number of lines you can draw using n spots is n(n−1)/2, which is
just the number of pairs of endpoints you can choose for the line; and the
number of intersections is n(n−1)(n−2)(n−3)/24, which is the number of
ways of choosing the four endpoints of two intersecting lines. The number
of regions created is thus

$1 + \frac{n(n-1)}{2} + \frac{n(n-1)(n-2)(n-3)}{24}$

which  simplifies  to  the  formula  given  above. 
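A quick numerical comparison shows where the doubling guess breaks down. In this sketch the two counts are written as binomial coefficients, using the fact that n(n−1)/2 and n(n−1)(n−2)(n−3)/24 are "n choose 2" and "n choose 4"; the function names are illustrative.

```python
from math import comb


def regions(n: int) -> int:
    """One more than the number of lines plus the number of intersections."""
    return 1 + comb(n, 2) + comb(n, 4)


def quartic(n: int) -> int:
    """The closed form (n^4 - 6n^3 + 23n^2 - 18n + 24) / 24."""
    return (n**4 - 6 * n**3 + 23 * n**2 - 18 * n + 24) // 24


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, regions(n), quartic(n), 2 ** (n - 1))
```

The first five rows agree with the powers of two, and the sixth is where the two sequences part ways.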


Exercise  9.2  (page  228) 

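She would assume that the 26th child would confirm that the first 26
numbers add up to $\frac{26 \times 27}{2}$, and from this show that the first 27 numbers add
up to $\frac{27 \times 28}{2}$ as follows:

$1 + 2 + 3 + \cdots + 27 = (1 + 2 + 3 + \cdots + 26) + 27$

$= \frac{26 \times 27}{2} + 27$

$= 27 \times \left(\frac{26}{2} + 1\right)$

$= 27 \times \frac{28}{2}$

$= \frac{27 \times 28}{2}$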


Exercise  9.3  (page  228) 

Young  Gauss  is  reputed  to  have  carried  out  the  following  calculation,  all  in 
his  head: 

          1 +   2 +   3 + ··· +  48 +  49 +  50
    +   100 +  99 +  98 + ··· +  53 +  52 +  51
    ---------------------------------------------
    =   101 + 101 + 101 + ··· + 101 + 101 + 101
    =   50 × 101
    =   5050

That is, he had noted that the sum consists of 50 pairs of numbers, where
each pair sums to 101.

There  are  many  stories  about  the  prodigious  Young  Gauss;  however  - 
without  taking  away  from  his  greatness  as  a  mathematician  -  his  biographers 
do  note  that  these  stories  are  mostly  attributed  to  Old  Gauss. 


Exercise  9.4  (page  231) 

1. For all n ≥ 0, $1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$.

Proof: By induction on n.

Base Case: We note that

$1^2 + 2^2 + 3^2 + \cdots + 0^2 = 0 = \frac{0(0+1)(2(0)+1)}{6}$.

Induction Step: We assume that, for some k,

$1^2 + 2^2 + 3^2 + \cdots + k^2 = \frac{k(k+1)(2k+1)}{6}$,

and from this inductive hypothesis we prove that

$1^2 + 2^2 + 3^2 + \cdots + k^2 + (k+1)^2 = \frac{(k+1)(k+2)(2k+3)}{6}$.

That is, we demonstrate that if the statement of the theorem is
true when n = k, then it must also be true when n = k+1.

$1^2 + 2^2 + 3^2 + \cdots + k^2 + (k+1)^2$

$= \frac{k(k+1)(2k+1)}{6} + (k+1)^2$ (by the inductive hypothesis)

$= \frac{k+1}{6}\bigl(k(2k+1) + 6(k+1)\bigr)$

$= \frac{k+1}{6}\bigl(2k^2 + 7k + 6\bigr)$

$= \frac{k+1}{6}\bigl((k+2)(2k+3)\bigr)$

$= \frac{(k+1)(k+2)(2k+3)}{6}$ □

2. For all n ≥ 0, $1 + 3 + 5 + \cdots + (2n-1) = n^2$.


Proof: By induction on n.

Base Case: We note that

$1 + 3 + 5 + \cdots + (2(0) - 1) = 0 = 0^2$.

Induction Step: We assume that, for some k,

$1 + 3 + 5 + \cdots + (2k - 1) = k^2$

and from this inductive hypothesis we prove that

$1 + 3 + 5 + \cdots + (2(k+1) - 1) = (k+1)^2$.

That is, we demonstrate that if the statement of the theorem is
true when n = k, then it must also be true when n = k+1.

$1 + 3 + 5 + \cdots + (2k - 1) + (2(k+1) - 1)$

$= k^2 + (2(k+1) - 1)$ (by the inductive hypothesis)

$= k^2 + 2k + 1$

$= (k+1)^2$ □

3. For all n ≥ 0,

$1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n+1) = \frac{n(n+1)(n+2)}{3}$.

Proof: By induction on n.

Base Case: We note that

$1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + 0(0+1) = 0 = \frac{0(0+1)(0+2)}{3}$.

Induction Step: We assume that, for some k,

$1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + k(k+1) = \frac{k(k+1)(k+2)}{3}$

and from this inductive hypothesis we prove that

$1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + (k+1)(k+2) = \frac{(k+1)(k+2)(k+3)}{3}$.

That is, we demonstrate that if the statement of the theorem is
true when n = k, then it must also be true when n = k+1.

$1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + k(k+1) + (k+1)(k+2)$

$= \frac{k(k+1)(k+2)}{3} + (k+1)(k+2)$ (by the inductive hypothesis)

$= \frac{(k+1)(k+2)}{3}(k+3)$

$= \frac{(k+1)(k+2)(k+3)}{3}$ □
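Before or after attempting such induction proofs, it can be reassuring to test the closed forms numerically. The sketch below compares each of the three sums (squares, odd numbers, and products k(k+1)) against its closed form for small n; it is a sanity check only, not a substitute for the proofs, and the function name is illustrative.

```python
def closed_forms_agree(n: int) -> bool:
    """Compare the three sums with their closed forms for a given n >= 0."""
    squares = sum(k * k for k in range(1, n + 1))
    odds = sum(2 * k - 1 for k in range(1, n + 1))
    products = sum(k * (k + 1) for k in range(1, n + 1))
    return (
        squares == n * (n + 1) * (2 * n + 1) // 6
        and odds == n * n
        and products == n * (n + 1) * (n + 2) // 3
    )


if __name__ == "__main__":
    print(all(closed_forms_agree(n) for n in range(100)))
```

Integer division is safe here because n(n+1)(2n+1) is always divisible by 6 and n(n+1)(n+2) is always divisible by 3.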

Exercise  9.5  (page  232) 

For all n ≥ 0, $F_0 \times F_1 \times \cdots \times F_n = F_{n+1} - 2$, where $F_n = 2^{2^n} + 1$.

Proof: By induction on n.

Base Case: For the base case (n=0), we note that
$F_0 = 3 = 5 - 2 = F_1 - 2$.

Induction Step: For the induction step, we assume that, for some k,

$F_0 \times F_1 \times \cdots \times F_k = F_{k+1} - 2$

and from this assumption (the “inductive hypothesis”) we prove that
$F_0 \times F_1 \times \cdots \times F_{k+1} = F_{k+2} - 2$.

That is, we demonstrate that if the statement of the theorem is true
when n = k, then it must also be true when n = k+1.

$F_0 \times F_1 \times F_2 \times \cdots \times F_k \times F_{k+1}$

$= (F_{k+1} - 2) \times F_{k+1}$ (by the inductive hypothesis)

$= (2^{2^{k+1}} - 1) \times (2^{2^{k+1}} + 1)$

$= 2^{2^{k+2}} - 1 = F_{k+2} - 2$ □
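Because Python integers have arbitrary precision, the identity can be spot-checked for the first several values of n. This is a small sketch with illustrative function names; it checks instances and does not replace the induction.

```python
def fermat(n: int) -> int:
    """F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def product_identity_holds(n: int) -> bool:
    """Check F_0 * F_1 * ... * F_n == F_{n+1} - 2 for one value of n."""
    product = 1
    for i in range(n + 1):
        product *= fermat(i)
    return product == fermat(n + 1) - 2


if __name__ == "__main__":
    print([fermat(n) for n in range(5)])
    print(all(product_identity_holds(n) for n in range(10)))
```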

Exercise  9.6  (page  232) 

For any real number r ≠ 1,

$1 + r + r^2 + r^3 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}$

for all n ≥ 0.

Proof: By induction on n.

Base Case: For the base case (n=0), we note that

$1 = \frac{1 - r^1}{1 - r}$.

Induction Step: For the induction step, we assume that, for some k,

$1 + r + r^2 + \cdots + r^k = \frac{1 - r^{k+1}}{1 - r}$

and from this assumption (the “inductive hypothesis”) we prove that

$1 + r + r^2 + \cdots + r^{k+1} = \frac{1 - r^{k+2}}{1 - r}$.

That is, we demonstrate that if the statement of the theorem is true
when n = k, then it must also be true when n = k+1.

By the inductive hypothesis we can rewrite the left-hand side of this
equation that we want to prove true as

$\frac{1 - r^{k+1}}{1 - r} + r^{k+1}$

which we can successively rewrite as

$\frac{1 - r^{k+1}}{1 - r} + \frac{(1 - r) r^{k+1}}{1 - r} = \frac{1 - r^{k+1} + r^{k+1} - r^{k+2}}{1 - r} = \frac{1 - r^{k+2}}{1 - r}$

which is the result we seek. □


Exercise  9.7  (page  233) 

Drawing n ≥ 1 circles so that any two intersect at two points but no three
intersect at any point divides the plane into $n^2 - n + 2$ regions.

Proof: By induction on n.

Base Case: One circle divides the plane into $1^2 - 1 + 2 = 2$ regions.

Induction Step: For the induction step, we assume that k circles divide
the plane into $k^2 - k + 2$ regions, and show that adding a (k+1)st circle
results in $(k+1)^2 - (k+1) + 2$ regions.

The (k+1)st circle must intersect the other k circles at 2k points,
meaning that 2k regions are divided into two. Thus, 2k new regions
are created, giving a total of $k^2 - k + 2 + 2k = (k+1)^2 - (k+1) + 2$
regions, which is as we needed to demonstrate. □


A Venn diagram depicting 4 sets would have to divide the plane into 16
regions. Therefore, it could not be drawn using circles, as by the above
result 4 circles would only divide the plane into $4^2 - 4 + 2 = 14$ regions.

Exercise  9.8  (page  234) 

$f(n) = n^2$ for all n ≥ 0, where

$f(n) = \begin{cases} 0, & \text{if } n = 0; \\ f(n-1) + 2n - 1, & \text{if } n > 0. \end{cases}$

Proof: By induction on n.

Base Case: For the base case (n=0), we simply note that $f(0) = 0 = 0^2$.
Induction Step: For the induction step, we assume that, for some k,
$f(k-1) = (k-1)^2$,

and from this assumption (the “inductive hypothesis”) we prove that
$f(k) = k^2$.


That is, we demonstrate that if the statement of the theorem is true
when n = k − 1, then it must also be true when n = k.

$f(k) = f(k-1) + 2k - 1$ (by definition)

$= (k-1)^2 + 2k - 1$ (by the inductive hypothesis)

$= k^2$. □


Exercise  9.9  (page  235) 

Every  n  >  1  is  either  prime  or  a  product  of  primes. 


Proof: By strong induction on n. Suppose that n > 1, and that for every
integer k with 1 < k < n, k is either prime or the product of primes. If n itself
is prime then we have nothing to prove, so suppose that n = ab with 1 < a < n
and 1 < b < n. By the inductive hypothesis, each of a and b is either prime or
a product of primes; but then since n = ab, n itself is a product of primes.


Exercise  9.10  (page  236) 

For all m ≥ 1 and all n ≥ m, $H_n - H_m \ge \frac{n-m}{n}$.

Proof: We assume that m ≥ 1 is fixed, and we prove the result by induction
on n.

Base Case (n = m): $H_m - H_m = 0 \ge \frac{m-m}{m}$.

Induction Step (n > m):

$H_n - H_m = H_{n-1} - H_m + \frac{1}{n}$

$\ge \frac{n-1-m}{n-1} + \frac{1}{n}$ (by inductive hypothesis)

$= \frac{(n-1)n - mn + (n-1)}{(n-1)n}$

$\ge \frac{(n-1)n - mn + m}{(n-1)n}$ (since n − 1 ≥ m)

$= \frac{(n-1)(n-m)}{(n-1)n} = \frac{n-m}{n}$ □


Exercise  9.11  (page  236) 

Fact: $(f_0)^2 + (f_1)^2 + (f_2)^2 + \cdots + (f_n)^2 = f_n f_{n+1}$ for all n ≥ 0.

Proof: By induction on n.


Base Case (n = 0):

$(f_0)^2 + (f_1)^2 + (f_2)^2 + \cdots + (f_0)^2 = (f_0)^2 = 0^2 = 0 \times 1 = f_0 f_1$.

Induction Step (n > 0):

$(f_0)^2 + (f_1)^2 + (f_2)^2 + \cdots + (f_n)^2 + (f_{n+1})^2$

$= f_n f_{n+1} + (f_{n+1})^2$ (by the inductive hypothesis)

$= f_{n+1}(f_n + f_{n+1}) = f_{n+1} f_{n+2}$ □

Exercise  9.12  (page  238) 

Fact: The quadratic equation $y^2 - xy - x^2 = \pm 1$ is satisfied by the pair
$(x, y) = (f_n, f_{n+1})$ for any n ≥ 0.

Proof: By induction on n.

Base Case (n = 0): With $(x, y) = (f_0, f_1) = (0, 1)$ we have
$y^2 - xy - x^2 = 1^2 - 0 \cdot 1 - 0^2 = 1$.

Induction Step (n > 0): Assuming that (x, y) = (a, b) solves this
equation, that is,

$b^2 - ab - a^2 = \pm 1$

it suffices to show that (x, y) = (b, a+b) also solves this equation; this
is because if $(a, b) = (f_n, f_{n+1})$ then $(f_{n+1}, f_{n+2}) = (b, a+b)$.

$(a+b)^2 - (a+b)b - b^2 = a^2 + 2ab + b^2 - ab - b^2 - b^2$

$= a^2 + ab - b^2$

$= -(b^2 - ab - a^2) = \mp 1$. □

Exercise  9.13  (page  238) 

Fact: The positive integer solutions (x, y) to
$y^2 - xy - x^2 = \pm 1$

are of the form $(f_n, f_{n+1})$ for some n > 0.

Proof: By induction on x+y. We first note that since x and y are positive,
we must have that x ≤ y. If x = y then we would have that $-x^2 = \pm 1$, in
which case we must have that x = y = 1, so $(x, y) = (f_1, f_2)$.

We now assume that 1 ≤ x < y and that $y^2 - xy - x^2 = \pm 1$, and note
that the pair (a, b) = (y−x, x) also satisfies the equation:

$b^2 - ab - a^2 = x^2 - (y-x)x - (y-x)^2$

$= x^2 - xy + x^2 - y^2 + 2xy - x^2$

$= -(y^2 - xy - x^2) = \mp 1$.

By induction, $(a, b) = (f_n, f_{n+1})$ for some n, from which we get

$x = b = f_{n+1}$ and
$y = a + x = f_n + f_{n+1} = f_{n+2}$,

so $(x, y) = (f_{n+1}, f_{n+2})$. □


Exercise  9.14  (page  239) 

Fact: $f_{n+1}^2 - f_n f_{n+2} = (-1)^n$ for all n ≥ 0.


Proof: By induction on n.

Base Case (n = 0): $f_1^2 - f_0 f_2 = 1^2 - 0 \cdot 1 = 1 = (-1)^0$.

Induction Step (n > 0):

$f_{n+1}^2 - f_n f_{n+2} = f_{n+1}(f_n + f_{n-1}) - f_n(f_{n+1} + f_n)$ (by definition)

$= f_{n+1} f_n + f_{n+1} f_{n-1} - f_n f_{n+1} - f_n^2$

$= -(f_n^2 - f_{n-1} f_{n+1})$

$= -(-1)^{n-1}$ (by the inductive hypothesis)

$= (-1)^n$ □
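The identity just proved can be spot-checked numerically. The sketch below assumes the indexing f_0 = 0, f_1 = 1 used in the base cases of these exercises; the function names are illustrative.

```python
def fib(n: int) -> int:
    """Return f_n with f_0 = 0, f_1 = 1 and f_n = f_{n-1} + f_{n-2}."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def identity_holds(n: int) -> bool:
    """Check f_{n+1}^2 - f_n * f_{n+2} == (-1)^n for one value of n."""
    return fib(n + 1) ** 2 - fib(n) * fib(n + 2) == (-1) ** n


if __name__ == "__main__":
    print([fib(n) for n in range(10)])
    print(all(identity_holds(n) for n in range(50)))
```

Since f_{n+2} = f_n + f_{n+1}, the left-hand side equals y² − xy − x² at (x, y) = (f_n, f_{n+1}), so the same check also illustrates why consecutive Fibonacci pairs satisfy the quadratic equation of Exercise 9.12.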


Exercise  9.15  (page  239) 

The  edges  that  supposedly  make  up  the  diagonal  of  the  rectangle  do  not  in 
fact  line  up.  Drawn  more  carefully,  a  gap  (or  overlap)  is  discovered  in  the 
middle  with  an  area  of  one  unit. 




Exercise  9.17  (page  243) 

The induction argument cannot be applied when n = 2: if S' and S'' are
overlapping sets which together make up S, then either S' = S or S'' = S,
in which case you cannot apply induction to this set, as you can only apply
induction to sets smaller than S.

Exercise  9.20  (page  246) 

Fact: length(L1 ++ L2) = length(L1) + length(L2) for all lists L1 and L2.

Proof: By induction on the structure of L1.

Base Case (L1 = [ ]):

length([ ] ++ L2) = length(L2) = length([ ]) + length(L2).

Induction Step (L1 = h : L):

length((h : L) ++ L2)

= length(h : (L ++ L2)) (by definition)

= 1 + length(L ++ L2)

= 1 + length(L) + length(L2) (by the inductive hypothesis)

= length(h : L) + length(L2). □


Chapter  10 

Exercise  10.1  (page  257) 

1.  By  brute  force  reasoning,  we  get  the  following  table: 


    n     1  2  3  4  5  6  7  8  9  10
    f(n)  1  2  3  ⊥  1  2  3  ⊥  1  2

f(n) represents the number of coins the first player should take when
there are n coins in the pile; we write f(n) = ⊥ (meaning f(n) is
undefined) in the cases in which the second player has the winning
strategy.

2. If the first player takes x coins, the second player can respond by
taking (4 − x) coins, leaving the first player a pile of (n − 4) coins when
starting from a pile of n coins. This gives a winning strategy for the
second player in a game starting with n = 4k coins, that is, a number
of coins divisible by 4.

In all other cases, starting with n = 4k + x coins (where x is 1, 2, or 3),
the first player puts the second player in a losing position by taking x
coins and leaving 4k coins.

3. The first player is in a losing position if the number of coins n is
divisible by (k+1); the second player wins by responding to every move
of the first player by taking (k+1) − x coins, where x is the number of
coins that the first player takes.

If the number of coins n is not divisible by (k+1), then the first player
wins by taking n mod (k+1) coins, leaving the second player in a losing
position.

4. The goal in the Misère game is to leave your opponent with one coin.
Thus, the second player has a winning strategy when there are 4k+1
coins, using the same strategy as the normal game.
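These claims can be checked by brute-force backward induction over pile sizes. The sketch below assumes the rules are as the answers suggest: players alternately remove between 1 and k coins, the player taking the last coin wins in the normal game and loses in the Misère game. The function name and the `misere` flag are illustrative.

```python
def winning_positions(max_coins: int, k: int, misere: bool = False) -> list:
    """win[n] is True when the player about to move from n coins can force a win."""
    win = [False] * (max_coins + 1)
    # With 0 coins left, the previous player took the last coin:
    # that is a loss for the player to move in normal play, a win in Misère play.
    win[0] = misere
    for n in range(1, max_coins + 1):
        # A position is winning if some legal move leaves the opponent losing.
        win[n] = any(not win[n - x] for x in range(1, min(k, n) + 1))
    return win


if __name__ == "__main__":
    normal = winning_positions(12, 3)
    print([n for n in range(1, 13) if not normal[n]])
    misere = winning_positions(12, 3, misere=True)
    print([n for n in range(1, 13) if not misere[n]])
```

Under these assumptions the losing pile sizes for the player to move are the multiples of k+1 in the normal game, and the sizes of the form (k+1)j + 1 in the Misère game.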

Exercise  10.2  (page  258) 

1. The second player can always place the first two noughts on adjacent
sides. (The first nought can be placed on a side which has both
adjacent sides empty, and one of these will still be empty after the first
player places the second cross.)

The third nought can then always be placed so that it is aligned with
at most one of the first two noughts (i.e., not in the centre square nor
in the corner between the two noughts). This is true because there are
five such squares, and only three of them can be occupied by crosses.

The fourth and final nought can then be placed safely, as there can
only be at most one square which could create a line of three noughts,
yet there will be two empty squares available to choose from.

2.  Suppose  the  first  player  places  the  first  cross  in  the  centre  and  then 
places  all  subsequent  crosses  directly  opposite  the  squares  on  which  the 
second  player  places  noughts.  If  a  line  of  three  crosses  should  arise,  it 
clearly  could  not  include  the  centre  square,  and  in  fact  would  imply 
that  there  is  a  line  of  three  noughts  already  in  place  directly  opposite 
the  line  of  three  crosses. 

Exercise  10.3  (page  258) 

Following  on  from  the  reasoning  started  in  the  question: 

•  9  o’clock  is  a  winning  position  (by  moving  3  hours  ahead  to  12  o’clock), 
and 

10  o’clock  is  a  winning  position  (by  moving  2  hours  ahead  to  12  o’clock); 

• 7 o’clock is a losing position (as you can only move to a winning
position: either 2 hours ahead to 9 o’clock or 3 hours ahead to 10 o’clock);

• 4 o’clock is a winning position (by moving 3 hours ahead to 7 o’clock),
and

5 o’clock is a winning position (by moving 2 hours ahead to 7 o’clock);

• 2 o’clock is a losing position (as you can only move to a winning
position: either 2 hours ahead to 4 o’clock or 3 hours ahead to 5 o’clock);

•  11  o’clock  is  a  winning  position  (by  moving  3  hours  ahead  to  2  o’clock), 
and 

12  o’clock  is  a  winning  position  (by  moving  2  hours  ahead  to  2  o’clock); 

• 8 o’clock is a losing position (as you can only move to a winning
position: either 2 hours ahead to 10 o’clock or 3 hours ahead to 11 o’clock);

•  6  o’clock  is  a  winning  position  (by  moving  2  hours  ahead  to  8  o’clock); 
and 

• 3 o’clock is a losing position (as you can only move to a winning
position: either 2 hours ahead to 5 o’clock or 3 hours ahead to 6 o’clock).

This  is  summarised  as  follows,  where  the  hours  on  the  clock  are  annotated 
with  the  winning  move  if  one  is  available. 


The  symbol  X  indicates  that  the  position  is  a  losing  position;  and  a  prime 
means  that  the  token  will  pass  through  the  12  o’clock  position  once  before 
landing  on  it  the  second  time  around  (assuming  the  losing  player  uses  a 
particular  strategy). 

To  see  that  this  annotation  is  correct,  it  suffices  to  note  that 

• every valid move from an hour labelled X (i.e., forward by either two
or three hours) leads to a position labelled 2 or 3, neither with a prime,
without passing through 12 o’clock;

•  every  valid  move  from  an  hour  labelled  X'  leads  to  a  position  labelled 
2  or  3,  (at  least)  one  of  which  is  primed,  without  passing  through 
12  o’clock; 

• an hour labelled 2 (respectively 3) - by moving forward by 2
(respectively 3) hours - leads either to 12 o’clock, or without passing through
12 o’clock to an hour labelled by X;

• an hour labelled 2' (respectively 3') - by moving forward by 2
(respectively 3) hours - leads either to an hour labelled by X' without passing
through 12 o’clock, or by first passing through 12 o’clock to an hour
labelled by X.

Exercise  10.4  (page  259) 

In  this  game: 

•  9  is  a  losing  position,  as  the  other  player  will  have  won  by  having 
moved  the  counter  there. 

•  8  is  a  winning  position,  as  a  move  of  1  takes  the  counter  to  the  losing 
position  9. 

•  7  is  not  a  legal  position,  as  it  is  at  the  head  of  a  snake. 

•  6  is  a  losing  position,  as  moves  of  1  and  2  take  the  counter  to  the 
winning  positions  4  and  8,  respectively. 

•  5  is  not  a  legal  position,  as  it  is  at  the  foot  of  a  ladder. 

•  4  is  a  winning  position,  as  a  move  of  1  takes  the  counter  to  the  losing 
position  9. 

•  3  is  not  a  legal  position,  as  it  is  at  the  foot  of  a  ladder. 

•  2  is  a  winning  position,  as  a  move  of  1  takes  the  counter  to  the  losing 
position  6. 

•  1  is  a  winning  position,  as  a  move  of  2  takes  the  counter  to  the  losing 
position  6. 

Exercise  10.5  (page  262) 

If  we  ignore  one  of  the  piles  and  consider  the  column  parity  of  the  remaining 
n—1  piles,  this  indicates  what  size  the  final  pile  would  have  to  be  in  order  to 
balance  the  position.  (For  example,  if  the  n—1  piles  are  balanced,  then  the 
final  pile  would  have  to  be  empty;  and  if  all  n  piles  are  balanced,  then  the 
column  parity  of  any  n—1  piles  would  equal  the  size  of  the  omitted  pile.) 

Thus,  for  each  pile,  there  is  at  most  one  winning  move  involving  that  pile, 
which  consists  of  leaving  the  number  of  coins  equal  to  the  column  parity  of 
the  remaining  n—1  piles. 

Therefore,  there  are  at  most  n  different  winning  moves  possible  from  a 
Nim  position  with  n  piles. 
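
As an illustration of this counting argument, the following sketch lists the winning moves from a Nim position. It assumes that the column parity of a set of piles is computed as the bitwise exclusive-or of the pile sizes; the function name `winning_moves` is hypothetical.

```python
from functools import reduce
from operator import xor


def winning_moves(piles):
    """Return (pile index, new size) for every move that balances the position."""
    total = reduce(xor, piles, 0)      # column parity of all n piles
    moves = []
    for index, size in enumerate(piles):
        target = total ^ size          # column parity of the other n-1 piles
        if target < size:              # a move must remove at least one coin
            moves.append((index, target))
    return moves


print(winning_moves([3, 4, 5]))   # one candidate per pile at most
print(winning_moves([1, 2, 3]))   # a balanced position has no winning move
```

Each pile contributes at most one entry to the result, so the list can never be longer than the number of piles.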

Exercise  10.6  (page  262) 

If  the  position  is  initially  unbalanced,  then  the  first  player  need  not  use  this 
new  move  in  order  to  win;  the  first  player  already  has  the  winning  strategy 
in  the  original  game. 

If, on the other hand, the position is initially balanced, then this new
move will still not help, as it will produce an unbalanced position, as does
any normal move.

Hence  this  new  rule  gives  the  first  player  no  new  advantage. 

There  is  another  way  to  see  that  this  new  move  is  obviously  of  no  help  to 
the  first  player:  the  second  player  can  respond  to  this  new  move  by  removing 
all  of  the  coins  in  the  new  pile,  thus  putting  the  first  player  back  into  the 
same  position  as  before  the  new  move  was  made. 

Exercise  10.7  (page  264) 

For $n = f_{k_1} + f_{k_2} + \cdots + f_{k_r} > 0$ with
$0 \ll k_1 \ll k_2 \ll \cdots \ll k_r$, let $\mu(n) = f_{k_1}$; that
is, $\mu(n)$ is the smallest Fibonacci number appearing in the representation of
n in the Fibonacci number system. Also, for convenience, define $\mu(0) = \infty$.

Consider  the  following  two  lemmas. 

Lemma 1. If $n > 0$ then $\mu(n - \mu(n)) > 2\mu(n)$.

This says that if you take $\mu(n)$ coins on your turn from a pile of n coins
- which, in particular, the first person may do on their first move if,
and only if, n is not a Fibonacci number, that is, $\mu(n) \neq n$ - then
your opponent will be unable to do this in response; that is, they will
be faced with some number $m = n - \mu(n)$ of coins and the most coins
they can take, namely $2\mu(n)$, will be less than $\mu(m) = \mu(n - \mu(n))$.

Lemma 2. If $0 < m < \mu(n)$ then $\mu(n - m) \leq 2m$.

This says that if you take fewer than $\mu(n)$ coins on your turn from
a pile of n coins - which, in particular, the first person must do on
their first move if, and only if, n is a Fibonacci number - then your
opponent will be able to take $\mu(n - m)$ coins from the pile of $n - m$
coins they are faced with, as $\mu(n - m)$ will be no more than twice
the number m of coins that you have taken.

Theorem 10.7 follows directly from these two lemmas. Given a pile of $n_1$
coins, if you take $\mu(n_1)$ coins then either you will have taken all coins and
won the game, or you will leave some number $n_2$ of coins from which, by
Lemma 1, your opponent cannot take $\mu(n_2)$ coins, and in particular cannot
take all of the remaining coins; and by Lemma 2 your opponent will be
forced to leave you with some number $n_3$ of coins from which you can once
again use the strategy of taking $\mu(n_3)$ coins; the play will continue in this
fashion until you succeed in taking all remaining coins.
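
Before proving the two lemmas, it can be reassuring to test them numerically. The sketch below assumes the Fibonacci numbers 1, 2, 3, 5, 8, ... and computes the representation greedily (largest Fibonacci number first), which never selects two consecutive Fibonacci numbers; the names `zeckendorf`, `mu` and `lemmas_hold` are our own.

```python
import math


def zeckendorf(n):
    """Greedy Fibonacci-number-system representation of n, largest term first."""
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    terms = []
    for f in reversed(fibs):
        if f <= n:
            terms.append(f)
            n -= f
    return terms


def mu(n):
    """Smallest Fibonacci number in the representation of n (infinity for 0)."""
    return math.inf if n == 0 else zeckendorf(n)[-1]


def lemmas_hold(limit):
    """Check Lemma 1 and Lemma 2 for every pile size below limit."""
    for n in range(1, limit):
        if not mu(n - mu(n)) > 2 * mu(n):        # Lemma 1
            return False
        for m in range(1, mu(n)):                # all m with 0 < m < mu(n)
            if not mu(n - m) <= 2 * m:           # Lemma 2
                return False
    return True


print(lemmas_hold(300))
```

A check of this kind is no substitute for the proofs, but it guards against misreading the statements.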

It  remains  only  to  prove  these  two  lemmas. 

Proof of Lemma 1. Let $n = f_{k_1} + f_{k_2} + \cdots + f_{k_r}$ be the Fibonacci number
system representation of n.

The result is immediate if n is a Fibonacci number (that is, if $r = 1$),
as then $n = \mu(n)$, so $\mu(n - \mu(n)) = \mu(0) = \infty > 2\mu(n)$.

Assume, then, that n is not a Fibonacci number (that is, $r \geq 2$). Then

\begin{align*}
\mu(n - \mu(n)) &= f_{k_2} && \text{(as } n - \mu(n) = f_{k_2} + \cdots + f_{k_r}\text{)} \\
                &\geq f_{k_1+2} && \text{(as } k_1 \ll k_2\text{)} \\
                &= f_{k_1} + f_{k_1+1} \\
                &> f_{k_1} + f_{k_1} \\
                &= 2 f_{k_1} \\
                &= 2\mu(n).
\end{align*}

□


Proof of Lemma 2. Assume that $0 < m < \mu(n)$. Let the Fibonacci number
system representations of n and m be

$n = f_{k_1} + f_{k_2} + \cdots + f_{k_r}$ and
$m = f_{l_1} + f_{l_2} + \cdots + f_{l_s}$.

By assumption, $m < \mu(n) = f_{k_1}$, so $f_{k_1} - m > 0$. Let the Fibonacci
number system representation of $f_{k_1} - m$ be

$f_{k_1} - m = f_{u_1} + f_{u_2} + \cdots + f_{u_t}$.

In particular, $f_{u_t} < f_{k_1}$, so $u_t < k_1 \ll k_2$, and hence $u_t \ll k_2$.

Then

\begin{align*}
n - m &= (f_{k_1} - m) + f_{k_2} + f_{k_3} + \cdots + f_{k_r} \\
      &= f_{u_1} + f_{u_2} + \cdots + f_{u_t} + f_{k_2} + f_{k_3} + \cdots + f_{k_r}
\end{align*}

and since $u_1 \ll u_2 \ll \cdots \ll u_t \ll k_2 \ll k_3 \ll \cdots \ll k_r$, this is the Fibonacci
number system representation of $n - m$.

Thus $\mu(n - m) = f_{u_1}$.

Note that

\begin{align*}
f_{k_1} &= m + f_{u_1} + f_{u_2} + \cdots + f_{u_t} \\
        &= f_{l_1} + f_{l_2} + \cdots + f_{l_s} + f_{u_1} + f_{u_2} + \cdots + f_{u_t}.
\end{align*}

Therefore we must have that $l_s \not\ll u_1$, that is, $u_1 \leq l_s + 1$, as otherwise
we would have two different Fibonacci number system representations
for $f_{k_1}$. Thus $f_{u_1} \leq f_{l_s+1}$, and hence

$\mu(n - m) = f_{u_1} \leq f_{l_s+1} = f_{l_s-1} + f_{l_s} \leq 2 f_{l_s} \leq 2m$.

□


Exercise 10.12 (page 270)

Assume  to  the  contrary  that  the  second  player  has  successfully  created  a 
path  connecting  the  top  border  to  the  bottom  border.  At  some  point  this 
path  must  cross  the  main  diagonal,  either  horizontally  or  vertically. 

As this path approaches the main diagonal from above, a downwards
move cannot turn towards the left border, and a rightwards move cannot
turn upwards. Hence this path must continuously travel downwards and to
the right. Therefore, it must reach the main diagonal vertically and thus
cross it horizontally; it cannot cross the diagonal vertically.

On the other hand, as this path approaches the main diagonal from below,
an upwards move cannot turn towards the right border, and a leftwards
move cannot turn downwards. Hence this path must continuously travel
upwards and to the left. Therefore, it must reach the main diagonal vertically
and thus cross it vertically; that is, it cannot cross it horizontally.


Chapter  11 

Exercise  11.3  (page  286) 

At any given moment, there must be some number i of missionaries and some
number j of cannibals on the left bank of the river, and 3−i missionaries
and 3−j cannibals on the right bank. If i≠j then the cannibals outnumber
the missionaries on one of the banks; this would only be safe if the number
of missionaries on that bank is in fact zero. Hence the only safe states
are those in which i=3 (all missionaries together on the left bank) or i=0
(all missionaries together on the right bank) or i=j (an equal number of
missionaries and cannibals on both banks). There are 10 such pairs of
numbers.

Apart from this, the only information needed to completely describe the
state of the system is where the boat is; it may be on the left bank or the
right bank. Combined with the 10 possible placements of the missionaries
and cannibals, this gives the system a total of 20 possible safe states.
However, four of these are not feasible. For a start, we clearly cannot have
all the people on one bank (i=j=3 or i=j=0) and the boat on the other.
Furthermore, if the missionaries are all on one bank and the cannibals are
all on the other ({i, j} = {0, 3}) then the boat must be with the cannibals;
if it were with the missionaries, then one or two of them must have just
ferried it across the river from the bank on which all three cannibals are,
which would have been an unsafe position.

The remaining 16 states are depicted in Figure 15.4, along with the possible
transitions between states drawn in. (To avoid clutter, the transitions are
drawn bi-directionally, as they all represent reversible actions.) The groups
on the two banks are depicted side-by-side divided by wiggly lines
representing the river, with the group holding the boat enclosed in parentheses.
There are five possible actions: m (a missionary crosses alone); mm (two
missionaries cross together); c (a cannibal crosses alone); cc (two cannibals
cross together); and mc (a missionary and a cannibal cross together). Notice
that all of the transitions are drawn bi-directionally, as every transition can
clearly be reversed.

Figure 15.4: The LTS of the Missionaries and Cannibals Riddle.

The group starts in the top-left state in which the whole group is on the
left bank, and they wish to get to the bottom-right state in which they are
all on the right bank. It is not hard to find such a path through the LTS
which involves 11 crossings.
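
The state count and the length of the shortest solution can both be confirmed by a breadth-first search of the LTS. The following is a sketch under our own encoding: a state is a triple (i, j, boat) with i missionaries and j cannibals on the left bank and boat either "L" or "R"; the function names are hypothetical.

```python
from collections import deque

# Each action moves (missionaries, cannibals) across the river with the boat.
MOVES = {"m": (1, 0), "mm": (2, 0), "c": (0, 1), "cc": (0, 2), "mc": (1, 1)}


def safe(i, j):
    """Safe placements: all missionaries together, or equal numbers per bank."""
    return i in (0, 3) or i == j


def successors(state):
    i, j, boat = state
    sign = -1 if boat == "L" else 1          # people leave the bank with the boat
    for action, (dm, dc) in MOVES.items():
        ni, nj = i + sign * dm, j + sign * dc
        if 0 <= ni <= 3 and 0 <= nj <= 3 and safe(ni, nj):
            yield action, (ni, nj, "R" if boat == "L" else "L")


def explore(start=(3, 3, "L")):
    """Breadth-first search; maps each reachable state to (previous state, action)."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for action, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return parent


def plan(goal=(0, 0, "R")):
    """Shortest sequence of actions from the start state to the goal."""
    parent = explore()
    actions, state = [], goal
    while parent[state] is not None:
        state, action = parent[state]
        actions.append(action)
    return actions[::-1]


print(len(explore()), len(plan()), plan())
```

Because breadth-first search visits states in order of distance from the start, the plan it returns uses the fewest possible crossings.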


Exercise  11.4  (page  287) 


A state of the system underlying this riddle consists of a pair of integers
(i, j) with 0≤i≤5 and 0≤j≤3, representing the volume of water in the
5-gallon and 3-gallon jugs A and B, respectively. The initial state is (0, 0) and
the final state you wish to reach is (4, 0).

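There are six types of moves possible from any given state (i, j):

$(i, j) \xrightarrow{\mathit{fillA}} (5, j)$   (if $i = 0$)
$(i, j) \xrightarrow{\mathit{fillB}} (i, 3)$   (if $j = 0$)
$(i, j) \xrightarrow{\mathit{emptyA}} (0, j)$   (if $i > 0$)
$(i, j) \xrightarrow{\mathit{emptyB}} (i, 0)$   (if $j > 0$)
$(i, j) \xrightarrow{\mathit{AtoB}} (\max(0, i+j-3), \min(3, i+j))$   (if $i > 0$ and $j < 3$)
$(i, j) \xrightarrow{\mathit{BtoA}} (\min(5, i+j), \max(0, i+j-5))$   (if $i < 5$ and $j > 0$)

Drawing out the LTS, we identify the following 7-step solution:

$(0, 0) \xrightarrow{\mathit{fillA}} (5, 0) \xrightarrow{\mathit{AtoB}} (2, 3) \xrightarrow{\mathit{emptyB}} (2, 0) \xrightarrow{\mathit{AtoB}} (0, 2)$
$\xrightarrow{\mathit{fillA}} (5, 2) \xrightarrow{\mathit{AtoB}} (4, 3) \xrightarrow{\mathit{emptyB}} (4, 0)$.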
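
Rather than drawing out the LTS by hand, the same solution can be found by a breadth-first search over the six move types. This is a minimal sketch using the move conditions listed above; the function names `moves` and `solve` are our own.

```python
from collections import deque


def moves(i, j):
    """Yield (action, next state) for the 5-gallon jug A and 3-gallon jug B."""
    if i == 0:
        yield "fillA", (5, j)
    if j == 0:
        yield "fillB", (i, 3)
    if i > 0:
        yield "emptyA", (0, j)
    if j > 0:
        yield "emptyB", (i, 0)
    if i > 0 and j < 3:
        yield "AtoB", (max(0, i + j - 3), min(3, i + j))
    if i < 5 and j > 0:
        yield "BtoA", (min(5, i + j), max(0, i + j - 5))


def solve(start=(0, 0), goal=(4, 0)):
    """Return a shortest list of actions leading from start to goal."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            actions = []
            while parent[state] is not None:
                state, action = parent[state]
                actions.append(action)
            return actions[::-1]
        for action, nxt in moves(*state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None  # goal unreachable


print(solve())
```

The search confirms that no solution shorter than seven steps exists, since breadth-first search reaches (4, 0) for the first time at depth 7.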


Exercise  11.5  (page  287) 

The  beer  mats  must  start  in  one  of  the  following  three  non-winning  config¬ 
urations: 

X:  3  one  way,  the  4th  the  other  way. 

Y:  2  face-up  and  2  face-down,  with  diagonally-opposite  corners  different. 
Z:  2  face-up  and  2  face-down,  with  diagonally-opposite  corners  the  same. 

Furthermore,  there  are  only  three  different  moves  which  you  may  apply  to 
the  beer  mats: 

a:  flip  one  beer  mat. 

b:  flip  two  adjacent  beer  mats. 

c:  flip  two  diagonally-opposite  beer  mats. 

(Flipping  3  beer  mats  has  the  same  effect  on  the  possible  configurations  as 
flipping  1  beer  mat;  and  flipping  all  four  beer  mats  has  no  effect  whatsoever 
on  the  configuration.) 

In the following table, we indicate which non-winning configurations we
may go to from each non-winning configuration.


For example, from an X-configuration, an a-move (flipping one beer mat)
could lead to a winning configuration, or to either a Y-configuration or a
Z-configuration; and from a Z-configuration, a c-move (flipping
diagonally-opposite beer mats) is guaranteed to lead to a winning configuration.

We can use a labelled transition system to keep track of which configurations
we may be in at any given time assuming that we have never passed
through a winning configuration. The LTS looks as follows.


Here,  we  start  in  the  state  labelled  “XYZ”  signifying  that  we  don’t  know 
which  state  X,  Y  or  Z  we  are  in.  If  we  do  an  a  move  or  a  b  move,  then  we 
may  still  be  in  any  of  these  states;  however,  if  we  do  a  c  move,  then  we  will 
know  that  we  cannot  be  in  a  Z  state. 

From this we can see that the shortest sequence of moves which guarantees
a win is the sequence “cbcacbc” of seven moves.
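
This argument can be replayed by tracking the set of configurations we might be in. The sketch below uses our own encoding: the four beer mats are listed in cyclic order as a tuple of 0s and 1s, and since we do not know how the mats are oriented, every concrete way of performing a move of a given type is considered. The names `FLIPS`, `kind`, `step` and `guarantees_win` are hypothetical.

```python
from itertools import product

# Concrete flips for each move type, with mats indexed 0..3 in cyclic order.
FLIPS = {
    "a": [(0,), (1,), (2,), (3,)],            # flip one beer mat
    "b": [(0, 1), (1, 2), (2, 3), (3, 0)],    # flip two adjacent beer mats
    "c": [(0, 2), (1, 3)],                    # flip two diagonally-opposite mats
}


def kind(cfg):
    """Classify a configuration as 'win', 'X', 'Y' or 'Z'."""
    ups = sum(cfg)
    if ups in (0, 4):
        return "win"
    if ups in (1, 3):
        return "X"
    return "Z" if cfg[0] == cfg[2] else "Y"   # Z: diagonal corners the same


def flip(cfg, indices):
    return tuple(1 - v if k in indices else v for k, v in enumerate(cfg))


def step(belief, move):
    """Configurations we may be in after the move, if we have not yet won."""
    after = {flip(cfg, idx) for cfg in belief for idx in FLIPS[move]}
    return {cfg for cfg in after if kind(cfg) != "win"}


def guarantees_win(sequence):
    belief = {cfg for cfg in product((0, 1), repeat=4) if kind(cfg) != "win"}
    for move in sequence:
        belief = step(belief, move)
    return not belief   # empty belief set: a win must already have occurred


print(guarantees_win("cbcacbc"))
```

Running `guarantees_win` over every sequence of at most six moves is a quick way to confirm that no shorter sequence suffices.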


Exercise  11.6  (page  288) 

1. There are six states. In the graphical presentation of the transition
system, these are represented by the following (x, y)-valued pairs:

(x = 246, y = 174)    (x = 72, y = 30)    (x = 12, y = 6)

(x = 72, y = 174)    (x = 12, y = 30)    (x = 0, y = 6)

(Of course, how the states are labelled is irrelevant.)

2. There are two actions, namely “x := x mod y” and “y := y mod x”.

3. There are five transitions. Labelling the states as above, these
transitions are:

$(246, 174) \xrightarrow{x := x \bmod y} (72, 174)$

$(72, 174) \xrightarrow{y := y \bmod x} (72, 30)$

$(72, 30) \xrightarrow{x := x \bmod y} (12, 30)$

$(12, 30) \xrightarrow{y := y \bmod x} (12, 6)$

$(12, 6) \xrightarrow{x := x \bmod y} (0, 6)$

Exercise  11.7  (page  289) 


Exercise  11.8  (page  290) 


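The above process has two states: $\mathit{Cl}{*}$ and $\mathit{Cl}_0$; one action: tick; and two
transitions: $\mathit{Cl}{*} \xrightarrow{\mathit{tick}} \mathit{Cl}{*}$ and $\mathit{Cl}{*} \xrightarrow{\mathit{tick}} \mathit{Cl}_0$.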

Exercise  11.9  (page  291) 

As described, a state is a triple (f, d, S) where

$f \in \{1, 2{\uparrow}, 2{\downarrow}, 3, 1^+, 2^+, 2^-, 3^-\}$;
$d \in \{\mathit{open}, \mathit{closed}\}$; and
$S \subseteq \{1, 2{\uparrow}, 2{\downarrow}, 3\}$.

There are 11 actions that the system can possibly do. Firstly, any of the
call buttons can be pressed on any of the floors:

$p_1$: (up) button on floor 1 is pressed;
$p_{2\uparrow}$: up button on floor 2 is pressed;
$p_{2\downarrow}$: down button on floor 2 is pressed;
$p_3$: (down) button on floor 3 is pressed.

Next, any of the floor buttons can be pressed in the elevator:

$e_1$: floor 1 button is pressed in the elevator;
$e_2$: floor 2 button is pressed in the elevator;
$e_3$: floor 3 button is pressed in the elevator.

Next, the elevator door can open or close:

op: the elevator door opens;
cl: the elevator door closes.

Finally, the elevator can move:

up: the elevator moves up;
dn: the elevator moves down.


Exactly  when  each  of  these  actions  can  occur,  and  their  effect  on  the  state 
of  the  system,  is  detailed  as  follows. 

Firstly,  any  button  can  be  pressed  on  any  of  the  floors  at  any  time.  If 
the  elevator  is  at  that  floor  with  its  door  open  and  travelling  in  the  right 
direction,  then  the  button  press  will  have  no  effect  on  the  state;  otherwise, 
the  floor,  tagged  by  the  requested  direction,  will  be  added  to  the  destination 
list: 


$(f, d, S) \xrightarrow{p_b} (f, d, S')$, where

    $S' = S$, if $f = b$ and $d = \mathit{open}$;
    $S' = S \cup \{b\}$, otherwise.


Next,  any  button  can  be  pressed  in  the  elevator  at  any  time.  If  the  elevator 
is  at  the  floor  being  requested  with  its  door  open,  then  the  button  press  will 
have  no  effect  on  the  state;  otherwise,  the  floor  being  requested,  tagged  by 
the  direction  to  get  to  the  requested  floor  (or  the  current  direction  being 
travelled  if  the  elevator  is  at  that  floor),  will  be  added  to  the  destination 
list: 


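$(f, d, S) \xrightarrow{e_1} (f, d, S')$, where

    $S' = S$, if $f = 1$ and $d = \mathit{open}$;
    $S' = S \cup \{1\}$, otherwise.

$(f, d, S) \xrightarrow{e_2} (f, d, S')$, where

    $S' = S$, if $f \in \{2{\uparrow}, 2{\downarrow}\}$ and $d = \mathit{open}$;
    $S' = S \cup \{2{\uparrow}\}$, if $f \in \{1, 1^+, 2^-\}$, or $f = 2{\uparrow}$ and $d = \mathit{closed}$;
    $S' = S \cup \{2{\downarrow}\}$, if $f \in \{3, 3^-, 2^+\}$, or $f = 2{\downarrow}$ and $d = \mathit{closed}$.

$(f, d, S) \xrightarrow{e_3} (f, d, S')$, where

    $S' = S$, if $f = 3$ and $d = \mathit{open}$;
    $S' = S \cup \{3\}$, otherwise.

Next, the door can either open or close at appropriate times:

$(f, \mathit{open}, S) \xrightarrow{cl} (f, \mathit{closed}, S)$;

$(f, \mathit{closed}, S) \xrightarrow{op} (f, \mathit{open}, S \setminus \{f\})$, if $f \in S$;

$(2{\uparrow}, \mathit{closed}, S) \xrightarrow{op} (2{\downarrow}, \mathit{open}, S \setminus \{2{\downarrow}\})$, if $2{\downarrow} \in S$ and $2{\uparrow}, 3 \notin S$;

$(2{\downarrow}, \mathit{closed}, S) \xrightarrow{op} (2{\uparrow}, \mathit{open}, S \setminus \{2{\uparrow}\})$, if $2{\uparrow} \in S$ and $2{\downarrow}, 1 \notin S$.

Finally, the elevator can move as and when appropriate:

$(1, \mathit{closed}, S) \xrightarrow{up} (1^+, \mathit{closed}, S)$, if $1 \notin S$ and $S \neq \emptyset$;

$(2{\uparrow}, \mathit{closed}, \{3\}) \xrightarrow{up} (2^+, \mathit{closed}, \{3\})$;

$(2{\downarrow}, \mathit{closed}, \{3\}) \xrightarrow{up} (2^+, \mathit{closed}, \{3\})$;

$(1^+, \mathit{closed}, S) \xrightarrow{up} (1^+, \mathit{closed}, S)$;

$(1^+, \mathit{closed}, S) \xrightarrow{up} (2{\uparrow}, \mathit{closed}, S)$;

$(2^+, \mathit{closed}, S) \xrightarrow{up} (2^+, \mathit{closed}, S)$;

$(2^+, \mathit{closed}, S) \xrightarrow{up} (3, \mathit{closed}, S)$;

$(3, \mathit{closed}, S) \xrightarrow{dn} (3^-, \mathit{closed}, S)$, if $3 \notin S$ and $S \neq \emptyset$;

$(2{\downarrow}, \mathit{closed}, \{1\}) \xrightarrow{dn} (2^-, \mathit{closed}, \{1\})$;

$(2{\uparrow}, \mathit{closed}, \{1\}) \xrightarrow{dn} (2^-, \mathit{closed}, \{1\})$;

$(3^-, \mathit{closed}, S) \xrightarrow{dn} (3^-, \mathit{closed}, S)$;

$(3^-, \mathit{closed}, S) \xrightarrow{dn} (2{\downarrow}, \mathit{closed}, S)$;

$(2^-, \mathit{closed}, S) \xrightarrow{dn} (2^-, \mathit{closed}, S)$;

$(2^-, \mathit{closed}, S) \xrightarrow{dn} (1, \mathit{closed}, S)$.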

Exercise  11.10  (page  291) 

The  two  models  for  flipping  coins  are  as  follows: 


As  required,  the  outcome  is  determined  in  the  first  model  already  when  the 
coin  is  tossed:  the  system  will  either  be  in  a  state  in  which  it  can  do  a  heads 
action  and  not  a  tails  action,  or  it  will  be  in  a  state  in  which  it  can  do  a 
tails  action  and  not  a  heads  action.  In  contrast  to  this,  when  the  coin  is 
tossed  in  the  second  model,  the  system  will  be  in  a  state  in  which  it  can  do 
either  a  heads  action  or  a  tails  action. 

Which  model  is  more  realistic?  We  might  introduce  quantum  mechanics 
and allude to the fate of Schrödinger’s cat placed in a sealed box with a
flask  of  poison,  a  radioactive  source,  and  a  mechanism  which  will  shatter  the 
flask  -  releasing  the  poison  and  killing  the  cat  -  if  a  Geiger  counter  detects 
a  radioactive  particle;  according  to  quantum  mechanics,  after  a  while  the 
cat  will  be  simultaneously  dead  and  alive  until  we  open  the  box;  only  by 
observing  the  cat  will  its  fate  be  sealed.  With  this  in  mind,  we  might  choose 
the  second  model  to  be  more  realistic. 

Barring the complexities of Schrödinger’s cat, the first model is more
realistic,  in  that  it  enforces  the  principle  that  the  toss  itself  decides  the  fate 
of  the  coin;  having  tossed  the  coin,  and  with  it  resting  on  the  back  of  one 
hand  shielded  from  view  by  the  palm  of  the  other,  no  further  forces  can 
influence  the  outcome  of  the  coin  flip.  The  coin  is  decidedly  showing  heads 
or  tails. 

We can contrast this situation with the model of the simple vending
machine from page 282 which accepts a 50p coin and allows the user to
decide whether to press a coffee button or a tea button. Having inserted the
50p coin, the user is completely free to choose which button to press, and
thus the model for the vending machine closely resembles the second
coin-flipping model above. Such a free choice is of course undesirable in a coin
flip. (It would be equally undesirable for the vending machine to eliminate
the user’s free choice of drinks when the 50p coin is inserted, as with the
first coin-flipping model above.)


Exercise  11.11  (page  296) 


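By the rule for action prefix,

$\mathit{pull}.\mathit{Broken} \xrightarrow{\mathit{pull}} \mathit{Broken}$ and $\mathit{reset}.\mathit{Off} \xrightarrow{\mathit{reset}} \mathit{Off}$.

Hence, by the rule for choice,

$\mathit{pull}.\mathit{Broken} + \mathit{reset}.\mathit{Off} \xrightarrow{\mathit{pull}} \mathit{Broken}$ and
$\mathit{pull}.\mathit{Broken} + \mathit{reset}.\mathit{Off} \xrightarrow{\mathit{reset}} \mathit{Off}$.

As $\mathit{Broken} \stackrel{\mathrm{def}}{=} \mathit{pull}.\mathit{Broken} + \mathit{reset}.\mathit{Off}$, the rule for Process Variables
allows us to infer our result:

$\mathit{Broken} \xrightarrow{\mathit{pull}} \mathit{Broken}$ and $\mathit{Broken} \xrightarrow{\mathit{reset}} \mathit{Off}$.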
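
The three inference rules used here can be turned directly into a small function that computes the transitions of a process term. This is only a sketch: the tuple encoding of terms ("nil", "prefix", "sum", "var") and the names `transitions` and `DEFS` are our own, and the function would not terminate on an unguarded definition such as X = X + a.0.

```python
def transitions(term, defs):
    """Return the set of (action, successor term) pairs derivable for a term."""
    tag = term[0]
    if tag == "nil":
        return set()
    if tag == "prefix":                 # action prefix: a.E --a--> E
        _, action, continuation = term
        return {(action, continuation)}
    if tag == "sum":                    # choice: moves of E + F are those of E or F
        _, left, right = term
        return transitions(left, defs) | transitions(right, defs)
    if tag == "var":                    # process variable: X inherits moves of its body
        return transitions(defs[term[1]], defs)
    raise ValueError(f"unknown term: {term!r}")


BROKEN, OFF = ("var", "Broken"), ("var", "Off")

# Broken = pull.Broken + reset.Off (the definition of Off is not needed here).
DEFS = {"Broken": ("sum", ("prefix", "pull", BROKEN), ("prefix", "reset", OFF))}

print(transitions(BROKEN, DEFS))
```

Each branch of the function corresponds to exactly one rule, so the derivation above is what the function performs when applied to Broken.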


Exercise  11.12  (page  297) 

$\mathit{Cl}{*} \stackrel{\mathrm{def}}{=} \mathit{tick}.\mathit{Cl}{*} + \mathit{tick}.\mathit{Cl}_0$.

Exercise  11.13  (page  298) 

1. $C_n \stackrel{\mathrm{def}}{=}$

    $i_5.C_5 + i_{10}.C_{10} + i_{20}.C_{20}$, if $n = 0$;
    $d_1.C_0$, if $n = 1$;
    $d_1.C_{n-1} + d_2.C_{n-2}$, if $2 \leq n \leq 4$;
    $d_1.C_{n-1} + d_2.C_{n-2} + d_5.C_{n-5}$, if $5 \leq n \leq 20$.

2. The transition diagram is depicted in Figure 15.5.

Exercise 11.14 (page 300)

The five states of the second vending machine are:

$V_2$
10p.coffee.collect.$V_2$ + 10p.tea.collect.$V_2$
coffee.collect.$V_2$
tea.collect.$V_2$
collect.$V_2$

The six states of the third vending machine are:

$V_3$
10p.coffee.collect.$V_3$
10p.tea.collect.$V_3$
coffee.collect.$V_3$
tea.collect.$V_3$
collect.$V_3$


Exercise  11.15  (page  301) 

No matter how we do a 10p action,
we must end up in a state in which
we may do a 10p action
and end up in a state in which
we may do a tea action.

Figure 15.6: The LTS for Exercise 11.16.


Exercise  11.16  (page  301) 

1.  The  transition  system  is  depicted  in  Figure  15.6. 

2.  From  state  C  you  may  do  an  a  action  and  be  in  a  state  in  which,  no 
matter  how  you  do  a  b  action  you  will  either  not  be  able  to  do  a  c 
action  or  you  will  not  be  able  to  do  a  d  action. 

On the other hand, from state D no matter how you do an a action,
you will be able to do a b action and end up in a state in which you
can do both a c action and a d action.

Exercise  11.17  (page  302) 

The given definition of equality is:

reflexive: Clearly $E \xrightarrow{a} G$ if, and only if, $E \xrightarrow{a} G$, so $E = E$.

symmetric: Suppose that $E = F$. To show that $F = E$ we need to
demonstrate that $F \xrightarrow{a} G$ if, and only if, $E \xrightarrow{a} G$. However, this
must be true, as this is exactly the same criterion that makes $E = F$,
namely that $E \xrightarrow{a} G$ if, and only if, $F \xrightarrow{a} G$.

transitive: Suppose that $E = F$ and $F = G$. To show that $E = G$ we
need to demonstrate that $E \xrightarrow{a} H$ if, and only if, $G \xrightarrow{a} H$. However,

$E \xrightarrow{a} H$ if, and only if, $F \xrightarrow{a} H$ (since $E = F$)

if, and only if, $G \xrightarrow{a} H$ (since $F = G$).

Exercise  11.18  (page  303) 

The only way to infer that $A = A_0$ would be to first show that $a.A = a.A_1$,
which would require us to show that $A = A_1$.

However, the only way to infer that $A = A_1$ would be to first show that
$a.A = a.A_2$, which would require us to show that $A = A_2$.

Likewise, the only way to infer that $A = A_2$ would be to first show that
$a.A = a.A_3$, which would require us to show that $A = A_3$.

Continuing in this fashion, we would never finish, and so we could never
reach our goal of inferring that $A = A_0$.


Chapter  12 

Exercise  12.2  (page  312) 

Fact: For any game $G_n(E, F)$, either the first player has a winning
strategy, or the second player has a winning strategy.


Proof: By induction on n.

For the base case, the second player clearly has a winning strategy for
any game $G_0(E, F)$ of length $n = 0$.

For the inductive case, assume that for any game $G_n(E', F')$ of length n,
either the first player has a winning strategy, or the second player has a
winning strategy. Suppose then that the following two properties hold:

• for all actions a and all states E', if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for
some state F' such that the second player has a winning strategy for
the game $G_n(E', F')$; and

• for all actions a and all states F', if $F \xrightarrow{a} F'$ then $E \xrightarrow{a} E'$ for
some state E' such that the second player has a winning strategy for
the game $G_n(E', F')$.

That is, suppose that no matter what the first player does as her first move
in the game $G_{n+1}(E, F)$ - either a move $E \xrightarrow{a} E'$ or a move $F \xrightarrow{a} F'$ -
the second player can respond in such a way that he gets into a position
in which he has a winning strategy in the game of length n. This clearly
defines a winning strategy for the second player in the game $G_{n+1}(E, F)$.

Hence, if the second player does not have a winning strategy in the
game $G_{n+1}(E, F)$, then one of the above two properties fails to hold. That
is, either

• $E \xrightarrow{a} E'$ in such a way that whenever $F \xrightarrow{a} F'$ the second player
does not have a winning strategy in the game $G_n(E', F')$; but then
by the inductive hypothesis, this implies that the first player has a
winning strategy in the game $G_n(E', F')$, which means she can use the
$E \xrightarrow{a} E'$ transition as the first move in a winning strategy for the game
$G_{n+1}(E, F)$; or

• $F \xrightarrow{a} F'$ in such a way that whenever $E \xrightarrow{a} E'$ the second player
does not have a winning strategy in the game $G_n(E', F')$; but then
by the inductive hypothesis, this implies that the first player has a
winning strategy in the game $G_n(E', F')$, which means she can use the
$F \xrightarrow{a} F'$ transition as the first move in a winning strategy for the game
$G_{n+1}(E, F)$. □

Exercise  12.3  (page  313) 

$C \sim_2 D$ (and hence $C \sim_0 D$ and $C \sim_1 D$) since from either C or D you
can do an a action and nothing else, and regardless of where that takes
you, you will be able to do a b action and nothing else. Therefore the
second player can obviously copy whatever two moves the first player makes
in the Bisimulation Game when the tokens start on the pair of states (C, D).

$C \not\sim_3 D$ (and hence $C \not\sim_n D$ for all $n \geq 3$) since the first player has a
strategy which will win her the game within three moves starting from the
pair of states (C, D):

• For her first move she can do $C \xrightarrow{a} A$, to which the second player
would have to respond with $D \xrightarrow{a} B$; the two tokens will then be on
the pair of states (A, B).

• For her second move she could then do $B \xrightarrow{b} c.0 + d.0$, to which
the second player would have to respond either with $A \xrightarrow{b} c.0$
or with $A \xrightarrow{b} d.0$; the two tokens will then be on the pair of states
(c.0, c.0 + d.0) or the pair of states (d.0, c.0 + d.0).

• For her third move, if the tokens are on the pair of states (c.0, c.0 + d.0)
then she should do $c.0 + d.0 \xrightarrow{d} 0$, and if the tokens are on the pair of
states (d.0, c.0 + d.0) then she should do $c.0 + d.0 \xrightarrow{c} 0$; in either case
the second player will not be able to respond.
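
The outcome of the n-round Bisimulation Game can be computed by following its definition recursively. The sketch below assumes that the LTS contains exactly the transitions mentioned in this solution (C and D may have further transitions in the original figure); the dictionary `LTS` and the function `equiv` are our own names.

```python
from functools import lru_cache

# Assumed transitions, taken from the moves described in the solution.
LTS = {
    "C": [("a", "A")],
    "D": [("a", "B")],
    "A": [("b", "c.0"), ("b", "d.0")],
    "B": [("b", "c.0+d.0")],
    "c.0": [("c", "0")],
    "d.0": [("d", "0")],
    "c.0+d.0": [("c", "0"), ("d", "0")],
    "0": [],
}


@lru_cache(maxsize=None)
def equiv(n, e, f):
    """True if the second player wins the n-round game started on (e, f)."""
    if n == 0:
        return True

    def matched(p, q):
        # Every move of p must be answered by an equally-labelled move of q
        # that leads to a pair from which the second player wins n-1 rounds.
        return all(
            any(b == a and equiv(n - 1, p2, q2) for b, q2 in LTS[q])
            for a, p2 in LTS[p]
        )

    return matched(e, f) and matched(f, e)


print(equiv(2, "C", "D"), equiv(3, "C", "D"))
```

Memoising the results with `lru_cache` keeps the computation small even though the same pair of states may be reached along several plays.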

Exercise  12.4  (page  315) 

Fact: For all $n \in \mathbb{N}$, and for all states E, F, and G, if $E \sim_n F$ and $F \sim_n G$
then $E \sim_n G$.


Proof: By induction on $n \in \mathbb{N}$.

For the base case n=0, Theorem 12.3(1) gives us that $E \sim_0 G$.

For the induction step, we assume that $E \sim_{n+1} F$ and $F \sim_{n+1} G$.
Referring to the pictorial representations of Theorem 12.3 (page 313), for the
induction step the argument will be based on the following picture:

Suppose that the first player makes a transition $E \xrightarrow{a} E'$. By
Theorem 12.3(2), since $E \sim_{n+1} F$ we have that $F \xrightarrow{a} F'$ for some F' such that
$E' \sim_n F'$; and hence again by Theorem 12.3(2) since $F \sim_{n+1} G$ we have that
$G \xrightarrow{a} G'$ for some G' such that $F' \sim_n G'$. Thus, by induction $E' \sim_n G'$. In
summary,

• If $E \xrightarrow{a} E'$ then $G \xrightarrow{a} G'$ for some G' such that $E' \sim_n G'$.

Suppose instead that the first player makes a transition $G \xrightarrow{a} G'$. By
Theorem 12.3(2), since $F \sim_{n+1} G$ we have that $F \xrightarrow{a} F'$ for some F' such that
$F' \sim_n G'$; and hence again by Theorem 12.3(2) since $E \sim_{n+1} F$ we have that
$E \xrightarrow{a} E'$ for some E' such that $E' \sim_n F'$. Thus, by induction $E' \sim_n G'$. In
summary,

• If $G \xrightarrow{a} G'$ then $E \xrightarrow{a} E'$ for some E' such that $E' \sim_n G'$.

These two bullet points, by Theorem 12.3(2), give us that $E \sim_{n+1} G$. □


Exercise  12.5  (page  315) 

Fact: For all $n \in \mathbb{N}$, $\mathit{Cl}_n \sim_n \mathit{Cl}$, while $\mathit{Cl}_n \not\sim_{n+1} \mathit{Cl}$.

Proof: We can show the equivalence by induction on n.

Base Case: $\mathit{Cl}_0 \sim_0 \mathit{Cl}$, by Theorem 12.3(1).

Induction Step: Assuming, for some n, that $\mathit{Cl}_n \sim_n \mathit{Cl}$, we can conclude
from Theorem 12.3(2) that $\mathit{Cl}_{n+1} \sim_{n+1} \mathit{Cl}$.

The inequivalence follows from noting that in the bisimulation game played
with the tokens on $\mathit{Cl}_n$ and $\mathit{Cl}$, after an exchange of n moves the tokens will
necessarily be on $\mathit{Cl}_0$ and $\mathit{Cl}$, and the first person will be able to make the
move $\mathit{Cl} \xrightarrow{\mathit{tick}} \mathit{Cl}$ which the second person cannot match. □


Fact: For all $n \in \mathbb{N}$, $\mathit{Clock} \sim_n \mathit{Clock}{*}$, while $\mathit{Clock} \not\sim \mathit{Clock}{*}$.

Proof: We can firstly note that if the first player is to have any chance
of winning the bisimulation game with the tokens on the states $\mathit{Clock}$ and
$\mathit{Clock}{*}$, then he must start with the move $\mathit{Clock}{*} \xrightarrow{\mathit{tick}} \mathit{Cl}$; the second player
could respond to any other opening move in such a way as to leave the two
tokens on the same state, leaving him with the obvious copycat strategy to
win.

In response to this opening move $\mathit{Clock}{*} \xrightarrow{\mathit{tick}} \mathit{Cl}$, the second player
can move $\mathit{Clock} \xrightarrow{\mathit{tick}} \mathit{Cl}_n$; and since (from above) $\mathit{Cl}_n \sim_n \mathit{Cl}$ for all n, by
Theorem 12.3(2) we can deduce that $\mathit{Clock} \sim_{n+1} \mathit{Clock}{*}$ for all n.

The inequivalence follows from the fact (from above) that $\mathit{Cl}_n \not\sim_{n+1} \mathit{Cl}$.

□
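
The first Fact can be tested for small n by playing out the game mechanically. The sketch assumes, as the proof suggests, that the clock Cl_{n+1} can tick to Cl_n, that Cl_0 cannot tick, and that Cl can always tick to itself; here Cl_k is encoded as the integer k and Cl as the string "Cl", and the function names are our own.

```python
from functools import lru_cache


def ticks(state):
    """States reachable by one tick (tick is the only action)."""
    if state == "Cl":
        return ("Cl",)                      # the perpetual clock
    return (state - 1,) if state > 0 else ()  # Cl_k ticks to Cl_(k-1); Cl_0 stops


@lru_cache(maxsize=None)
def equivalent(n, p, q):
    """True if the second player wins the n-round bisimulation game on (p, q)."""
    if n == 0:
        return True
    forth = all(any(equivalent(n - 1, p2, q2) for q2 in ticks(q)) for p2 in ticks(p))
    back = all(any(equivalent(n - 1, p2, q2) for p2 in ticks(p)) for q2 in ticks(q))
    return forth and back


for n in range(5):
    print(n, equivalent(n, n, "Cl"), equivalent(n + 1, n, "Cl"))
```

For each n the first result should be true and the second false, matching the statement that n rounds are not enough to distinguish Cl_n from Cl but n+1 rounds are.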


Exercise  12.6  (page  317) 

To prove that $\mathcal{R}$ is a bisimulation relation, we need to demonstrate that the
bisimulation property from Definition 12.5 holds of each of the five pairs of
states related by $\mathcal{R}$.

• $(P_1, Q_1) \in \mathcal{R}$:

- $P_1 \to P_2$ is matched by $Q_1 \to Q_2$, and vice versa, as $(P_2, Q_2) \in \mathcal{R}$.

- $P_1 \to P_3$ is matched by $Q_1 \to Q_3$, and vice versa, as $(P_3, Q_3) \in \mathcal{R}$.

• $(P_2, Q_2) \in \mathcal{R}$:

- $P_2 \to P_3$ is matched by $Q_2 \to Q_3$, and vice versa, as $(P_3, Q_3) \in \mathcal{R}$.

• $(P_2, Q_4) \in \mathcal{R}$:

- $P_2 \to P_3$ is matched by $Q_4 \to Q_5$, and vice versa, as $(P_3, Q_5) \in \mathcal{R}$.

• $(P_3, Q_3) \in \mathcal{R}$:

- $P_3 \to P_1$ is matched by $Q_3 \to Q_1$, and vice versa, as $(P_1, Q_1) \in \mathcal{R}$.

- $P_3 \to P_2$ is matched by $Q_3 \to Q_4$, and vice versa, as $(P_2, Q_4) \in \mathcal{R}$.

• $(P_3, Q_5) \in \mathcal{R}$:

- $P_3 \to P_1$ is matched by $Q_5 \to Q_1$, and vice versa, as $(P_1, Q_1) \in \mathcal{R}$.

- $P_3 \to P_2$ is matched by $Q_5 \to Q_2$, and vice versa, as $(P_2, Q_2) \in \mathcal{R}$.

Exercise  12.7  (page  317) 

Assume that $\mathcal{R}$ and $\mathcal{S}$ are bisimulation relations over the states of a labelled
transition system, and that $(E, G) \in \mathcal{R} \circ \mathcal{S}$. This means that $E \mathcal{S} F$ and
$F \mathcal{R} G$ for some state F.

• If $E \xrightarrow{a} E'$, then from $E \mathcal{S} F$ we get that $F \xrightarrow{a} F'$ for some F' such that
$E' \mathcal{S} F'$; and thus from $F \mathcal{R} G$ we get that $G \xrightarrow{a} G'$ for some G' such
that $F' \mathcal{R} G'$ and hence such that $(E', G') \in \mathcal{R} \circ \mathcal{S}$.

• If $G \xrightarrow{a} G'$, then from $F \mathcal{R} G$ we get that $F \xrightarrow{a} F'$ for some F' such
that $F' \mathcal{R} G'$; and thus from $E \mathcal{S} F$ we get that $E \xrightarrow{a} E'$ for some E'
such that $E' \mathcal{S} F'$ and hence such that $(E', G') \in \mathcal{R} \circ \mathcal{S}$.

Thus $\mathcal{R} \circ \mathcal{S}$ is a bisimulation relation.
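
The closure of bisimulation relations under composition can be illustrated on a small example. The LTS below is hypothetical and chosen only for illustration (three processes which alternate a and b actions); the functions `is_bisimulation` and `compose` are our own names, with relations represented as sets of pairs of states.

```python
def is_bisimulation(relation, lts):
    """Check the bisimulation property for every pair in the relation."""
    for e, f in relation:
        for a, e2 in lts[e]:        # each move of e must be matched by f
            if not any(b == a and (e2, f2) in relation for b, f2 in lts[f]):
                return False
        for a, f2 in lts[f]:        # each move of f must be matched by e
            if not any(b == a and (e2, f2) in relation for b, e2 in lts[e]):
                return False
    return True


def compose(s, r):
    """All pairs (e, g) such that e S f and f R g for some state f."""
    return {(e, g) for e, f in s for f2, g in r if f == f2}


# Hypothetical LTS: t and p are two-state loops, q is a partly unfolded copy.
LTS = {
    "t0": [("a", "t1")], "t1": [("b", "t0")],
    "p0": [("a", "p1")], "p1": [("b", "p0")],
    "q0": [("a", "q1")], "q1": [("b", "q2")], "q2": [("a", "q1")],
}
S = {("t0", "p0"), ("t1", "p1")}
R = {("p0", "q0"), ("p1", "q1"), ("p0", "q2")}

print(is_bisimulation(S, LTS), is_bisimulation(R, LTS))
print(sorted(compose(S, R)), is_bisimulation(compose(S, R), LTS))
```

The composite relation passes the same check as its two components, as the argument above guarantees in general.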

Exercise  12.8  (page  318) 

Let $E \preceq_n F$ (where $n$ may be $\infty$) mean that the second player has a winning
strategy in the $n$-round game in which the first player must always move
the token which starts on state $E$. Then clearly $\asymp_n \,=\, \preceq_n \cap \preceq_n^{-1}$.

(a) $\asymp_n$ is an equivalence relation, as it is:

reflexive: If both tokens are on the same node, then the second
player has the obvious winning strategy of following the lead of the
first player, copying each move.

symmetric: This follows from the fact that $\asymp_n \,=\, \preceq_n \cap \preceq_n^{-1}$.

transitive: We first show, by induction on $n$, that $E \preceq_n G$ whenever
$E \preceq_n F$ and $F \preceq_n G$. If $n = 0$ then we immediately have that
$E \preceq_0 G$, so suppose $n = k+1$, and suppose that the first player makes
a transition $E \xrightarrow{a} E'$; then we must have that $F \xrightarrow{a} F'$ with $E' \preceq_k F'$,
and thus we have that $G \xrightarrow{a} G'$ with $F' \preceq_k G'$; hence by induction
$E' \preceq_k G'$, and so $E \preceq_{k+1} G$.

To demonstrate that $\preceq_\infty$ is transitive, we modify Definition 12.5 (page
316) to define a simulation relation to be a binary relation $\mathcal{R}$ over
states which satisfies the following property: if $E \mathrel{\mathcal{R}} F$ then

- if $E \xrightarrow{a} E'$ then $F \xrightarrow{a} F'$ for some $F'$ such that $E' \mathrel{\mathcal{R}} F'$.

We then rephrase Theorem 12.6 (page 316) as

The second player has a winning strategy in an infinite simulation
game with the tokens starting on states $E$ and $F$ if, and only
if, $E \mathrel{\mathcal{R}} F$ for some simulation relation $\mathcal{R}$. Hence in particular,
$\mathcal{R} \subseteq \preceq_\infty$ for any simulation relation $\mathcal{R}$.

The proof of this result is completely analogous to that for Theorem 12.6.
Our result then follows by showing that $\mathcal{R} \circ \mathcal{S}$ is a simulation
relation whenever $\mathcal{R}$ and $\mathcal{S}$ are: this is shown as for the solution to
Exercise 12.7.

Assume then that $\mathcal{R}$ and $\mathcal{S}$ are simulation relations, and that $(E, G) \in
\mathcal{R} \circ \mathcal{S}$. This means that $E \mathrel{\mathcal{S}} F$ and $F \mathrel{\mathcal{R}} G$ for some state $F$.


If $E \xrightarrow{a} E'$, then from $E \mathrel{\mathcal{S}} F$ we get that $F \xrightarrow{a} F'$ for some $F'$ such that
$E' \mathrel{\mathcal{S}} F'$; and thus from $F \mathrel{\mathcal{R}} G$ we get that $G \xrightarrow{a} G'$ for some $G'$ such
that $F' \mathrel{\mathcal{R}} G'$, and hence such that $(E', G') \in \mathcal{R} \circ \mathcal{S}$.

Thus $\mathcal{R} \circ \mathcal{S}$ is a simulation relation.

(b) If the second player has a winning strategy in the bisimulation game,
then he can use this same strategy to win the new game. The new
game only restricts the possible moves of the first player.

(c) It is easily verified that $a.b.0 \not\sim_2 a.b.0 + a.0$, while $a.b.0 \asymp_2 a.b.0 + a.0$.

Exercise  12.9  (page  321) 

Suppose, by way of contradiction, that the first transition system of Figure 12.1
is coloured with a bisimulation colouring which assigns the same
colour to the two states $X$ and $U$. Since $U \xrightarrow{a} V$, there must be an $a$-labelled
transition out of $X$ leading to a state with the same colour as $V$. This state
must be $Y$, and hence $V$ and $Y$ must have the same colour in this supposed
bisimulation colouring. But then since $Y \xrightarrow{c} Z$, there must be a $c$-labelled
transition out of $V$ leading to a state with the same colour as $Z$. However,
there is no such state, of any colour, which provides us with our desired
contradiction.

Exercise  12.10  (page  322) 

Consider  the  first  transition  system  of  Figure  12.1. 


The initial all-white colouring is not a bisimulation colouring, as the white
states V and Y have b-transitions to white states, whereas the other white
states U, W, X and Z do not have b-transitions to white states. Hence, by
the invariant, states V and Y cannot be equivalent to the other white states;
in any bisimulation colouring, states V and Y must each have a different
colour from states U, W, X and Z. Hence we may safely refine our colouring
by making states V and Y a different colour (gray, say).

This is still not a bisimulation colouring, as the gray state Y has a
c-transition to a white state, whereas the other gray state V does not. Hence, by
the invariant, states V and Y cannot be equivalent, and so we may safely
refine our colouring by making Y a different colour, say gray-on-black. At
the same time we can note that the white state W has a c-transition to a
white state, whereas the other white states U, X and Z do not, and so we can
safely make W a different colour, say gray-on-white.


Again this is still not a bisimulation colouring, as the white state U has
an a-transition to a gray state, whereas the other white states X and Z
do not; and the white state X has an a-transition to a gray-on-black state,
whereas the other white states U and Z do not. Hence, we may safely refine
our colouring by making X a different colour, say black, and Z a different
colour, say black-on-gray.


This  colouring  is  a  bisimulation  colouring,  which  by  construction  satisfies 
our  invariant.  That  it  is  a  bisimulation  colouring  is  clear,  since  there  are  no 
two  states  with  the  same  colour. 
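
The refinement procedure followed above can be sketched in code. This is an illustrative sketch only: the transition system below is hypothetical (it is not the system of Figure 12.1), colours are represented by integers rather than shades, and all states that can be split are split in one pass instead of one group at a time.

```python
def refine_colouring(states, trans):
    """Naive colour refinement: start all-white and split until stable.

    trans is a set of (source, label, target) triples.
    Returns a dict mapping each state to a colour (an integer).
    """
    colour = {s: 0 for s in states}  # the initial all-white colouring
    while True:
        # signature of a state: its own colour plus the (label, target colour)
        # pairs of its outgoing transitions
        sig = {
            s: (colour[s],
                frozenset((a, colour[t]) for (p, a, t) in trans if p == s))
            for s in states
        }
        # states keep a common colour only if their signatures agree
        names = {}
        new_colour = {s: names.setdefault(sig[s], len(names)) for s in states}
        if len(set(new_colour.values())) == len(set(colour.values())):
            return new_colour  # no group was split: the colouring is stable
        colour = new_colour


# Hypothetical system: two three-state chains that differ by one c-transition.
states = ["p0", "p1", "p2", "q0", "q1", "q2"]
trans = {
    ("p0", "a", "p1"), ("p1", "b", "p2"),
    ("q0", "a", "q1"), ("q1", "b", "q2"), ("q1", "c", "q2"),
}
colour = refine_colouring(states, trans)
print(colour)
```

In this example the two terminal states end up sharing a colour, while the extra c-transition separates q1 from p1 and, one round later, q0 from p0.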

Exercise 12.11 (page 326)

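Take:

$E_0 \stackrel{\text{def}}{=} \mathit{Clock}$        $F_0 \stackrel{\text{def}}{=} \mathit{Clock}^*$

$E_{n+1} \stackrel{\text{def}}{=} \mathit{tick}.E_n$        $F_{n+1} \stackrel{\text{def}}{=} \mathit{tick}.F_n$

By a straightforward induction argument we can show that, for each $n \in \mathbb{N}$,
$E_n \sim_{\omega+n} F_n$ but $E_n \not\sim_{\omega+n+1} F_n$.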

Chapter  13 

Exercise  13.1  (page  336) 

1. $\langle\mathit{coffee}\rangle\mathbf{true}$ says:

we  may  do  a  ‘coffee’  action  and  end  up  in  a  state  in  which 
true  is  true. 

This  property  will  be  true  if  it  is  possible  to  do  a  ‘coffee’  action. 

2. $\langle\mathit{coffee}\rangle\mathbf{false}$ says:

we  may  do  a  ‘coffee’  action  and  end  up  in  a  state  in  which 
false  is  true. 

Such  a  ‘coffee’  action  cannot  therefore  be  possible,  as  false  could  not 
possibly  be  true  in  any  subsequent  state.  Therefore  this  property  can 
never  be  satisfied;  it  is  equivalent  to  the  property  false. 

3. $[\mathit{coffee}]\mathbf{true}$ says:

no matter how we do a ‘coffee’ action, we must end up in a
state in which true is true.

This  property  will  always  be  true  -  regardless  of  whether  or  not  we  can 
do  a  ‘coffee’  action  -  as  true  will  of  course  be  true  in  any  subsequent 
state.  This  property  is  therefore  equivalent  to  the  property  true. 

4. $[\mathit{coffee}]\mathbf{false}$ says:

no matter how we do a ‘coffee’ action, we must end up in a
state in which false is true.

A ‘coffee’ action must therefore not be possible, since false can never
be true in any subsequent state. This formula thus says the same thing
as $\neg\langle\mathit{coffee}\rangle\mathbf{true}$.
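
These four readings can be confirmed with a small satisfaction checker. The sketch below is an illustration under stated assumptions: formulas are encoded as nested tuples (an encoding chosen here, not taken from the text), and the two-state machine with states idle and paid is hypothetical.

```python
def sat(state, formula, trans):
    """Decide whether `state` satisfies an HML `formula`.

    Formulas are nested tuples: ("true",), ("false",), ("not", P),
    ("and", P, Q), ("or", P, Q), ("dia", a, P) for <a>P and
    ("box", a, P) for [a]P.  trans is a set of (source, label, target).
    """
    op = formula[0]
    if op == "true":
        return True
    if op == "false":
        return False
    if op == "not":
        return not sat(state, formula[1], trans)
    if op == "and":
        return sat(state, formula[1], trans) and sat(state, formula[2], trans)
    if op == "or":
        return sat(state, formula[1], trans) or sat(state, formula[2], trans)
    # both modalities look at the successors under the given action
    succ = [t for (s, a, t) in trans if s == state and a == formula[1]]
    if op == "dia":
        return any(sat(t, formula[2], trans) for t in succ)
    if op == "box":
        return all(sat(t, formula[2], trans) for t in succ)
    raise ValueError(f"unknown operator: {op}")


# Hypothetical machine: a coin must be inserted before coffee is available.
trans = {("idle", "coin", "paid"), ("paid", "coffee", "idle")}
TRUE, FALSE = ("true",), ("false",)
formulas = {
    "<coffee>true": ("dia", "coffee", TRUE),
    "<coffee>false": ("dia", "coffee", FALSE),
    "[coffee]true": ("box", "coffee", TRUE),
    "[coffee]false": ("box", "coffee", FALSE),
}
for name, f in formulas.items():
    print(name, {s: sat(s, f, trans) for s in ("idle", "paid")})
```

On this machine the first formula holds exactly where coffee is possible, the second holds nowhere, the third holds everywhere, and the fourth holds exactly where coffee is impossible.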

Exercise  13.2  (page  336) 

$\neg\langle a\rangle\langle a\rangle\mathbf{true}$.

In words, this says that it is not the case that I can do an ‘a’ action and get
into a state in which I can do another ‘a’ action.

Exercise 13.3 (page 336)

$[\mathit{tick}]\langle\mathit{tick}\rangle\mathbf{true}$.

In  words,  this  says  that  no  matter  how  I  do  a  ‘tick’  action,  I  must  end  up 
in  a  state  in  which  I  can  do  another  ‘tick’  action.  This  is  true  of  the  clock 
Cl  but  not  of  the  clock  Cl*,  as  the  latter  clock  may  stop  after  just  one  tick. 

Exercise  13.4  (page  340) 

1,  4,  5,  6,  7  and  10  are  valid,  whereas  2,  3,  8,  9,  11  and  12  are  not  valid. 

Exercise  13.5  (page  341) 

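1. $\langle\mathit{pull}\rangle\langle\mathit{pull}\rangle\langle\mathit{break}\rangle\mathbf{true}$.

This is true only of the state On.

2. $\langle\mathit{pull}\rangle\langle\mathit{pull}\rangle\langle\mathit{reset}\rangle\mathbf{true}$.

This is true only of the state Broken.

3. $\neg\langle\mathit{pull}\rangle\mathbf{true}$.

This is not true of any state; you can do a ‘pull’ action from any state.

4. $\langle\mathit{pull}\rangle\mathbf{true} \wedge \neg\langle\mathit{break}\rangle\mathbf{true} \wedge \neg\langle\mathit{reset}\rangle\mathbf{true}$.

This is true only of the state Off.

Note that this assumes that the only actions available to the process
are ‘pull’, ‘break’ and ‘reset’. We need to include a conjunct $\neg\langle a\rangle\mathbf{true}$
for every action $a \neq \mathit{pull}$ to explicitly disallow the possibility of such
an action ‘a’ being possible.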

Exercise  13.6  (page  342) 

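Fact: $\neg[a]P \Leftrightarrow \langle a\rangle\neg P$

Proof:

$$
\begin{aligned}
E \models \neg[a]P
  &\Leftrightarrow E \not\models [a]P \\
  &\Leftrightarrow \neg\forall F\,(E \xrightarrow{a} F \Rightarrow F \models P) \\
  &\Leftrightarrow \exists F\,\neg(E \xrightarrow{a} F \Rightarrow F \models P) \\
  &\Leftrightarrow \exists F\,(E \xrightarrow{a} F \wedge F \not\models P) \\
  &\Leftrightarrow \exists F\,(E \xrightarrow{a} F \wedge F \models \neg P) \\
  &\Leftrightarrow E \models \langle a\rangle\neg P \qquad \Box
\end{aligned}
$$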

Exercise  13.7  (page  343) 


Theorem 13.6: For any process $E$ and any property $P$ of HML:

1. $E \models \mathrm{pos}(P)$ if, and only if, $E \models P$; and

2. $E \models \mathrm{neg}(P)$ if, and only if, $E \not\models P$.

Proof: By induction on the structure of $P$. That is, we demonstrate that

1. $E \models \mathrm{pos}(P)$ if, and only if, $E \models P$; and

2. $E \models \mathrm{neg}(P)$ if, and only if, $E \not\models P$

under the assumption that, for any process $F$ and any property $Q$ smaller
than $P$,

1. $F \models \mathrm{pos}(Q)$ if, and only if, $F \models Q$; and

2. $F \models \mathrm{neg}(Q)$ if, and only if, $F \not\models Q$.

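We thus argue by cases on the structure of $P$:

$P = \mathbf{true}$:

1. $E \models \mathrm{pos}(\mathbf{true})$
   $\Leftrightarrow E \models \mathbf{true}$   [by definition of $\mathrm{pos}(\mathbf{true})$]

2. $E \models \mathrm{neg}(\mathbf{true})$
   $\Leftrightarrow E \models \mathbf{false}$   [by definition of $\mathrm{neg}(\mathbf{true})$]
   $\Leftrightarrow E \not\models \mathbf{true}$   [by semantic definition for true and false]

$P = \mathbf{false}$:

1. $E \models \mathrm{pos}(\mathbf{false})$
   $\Leftrightarrow E \models \mathbf{false}$   [by definition of $\mathrm{pos}(\mathbf{false})$]

2. $E \models \mathrm{neg}(\mathbf{false})$
   $\Leftrightarrow E \models \mathbf{true}$   [by definition of $\mathrm{neg}(\mathbf{false})$]
   $\Leftrightarrow E \not\models \mathbf{false}$   [by semantic definition for true and false]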

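$P = \neg Q$:

1. $E \models \mathrm{pos}(\neg Q)$
   $\Leftrightarrow E \models \mathrm{neg}(Q)$   [by definition of $\mathrm{pos}(\neg Q)$]
   $\Leftrightarrow E \not\models Q$   [by induction hypothesis 2]
   $\Leftrightarrow E \models \neg Q$   [by semantic definition for $\neg$]

2. $E \models \mathrm{neg}(\neg Q)$
   $\Leftrightarrow E \models \mathrm{pos}(Q)$   [by definition of $\mathrm{neg}(\neg Q)$]
   $\Leftrightarrow E \models Q$   [by induction hypothesis 1]
   $\Leftrightarrow E \not\models \neg Q$   [by semantic definition for $\neg$]

$P = Q_1 \wedge Q_2$:

1. $E \models \mathrm{pos}(Q_1 \wedge Q_2)$
   $\Leftrightarrow E \models \mathrm{pos}(Q_1) \wedge \mathrm{pos}(Q_2)$   [by definition of $\mathrm{pos}(Q_1 \wedge Q_2)$]
   $\Leftrightarrow E \models \mathrm{pos}(Q_1)$ and $E \models \mathrm{pos}(Q_2)$   [by semantic definition for $\wedge$]
   $\Leftrightarrow E \models Q_1$ and $E \models Q_2$   [by induction hypothesis 1]
   $\Leftrightarrow E \models Q_1 \wedge Q_2$   [by semantic definition for $\wedge$]

2. $E \models \mathrm{neg}(Q_1 \wedge Q_2)$
   $\Leftrightarrow E \models \mathrm{neg}(Q_1) \vee \mathrm{neg}(Q_2)$   [by definition of $\mathrm{neg}(Q_1 \wedge Q_2)$]
   $\Leftrightarrow E \models \mathrm{neg}(Q_1)$ or $E \models \mathrm{neg}(Q_2)$   [by semantic definition for $\vee$]
   $\Leftrightarrow E \not\models Q_1$ or $E \not\models Q_2$   [by induction hypothesis 2]
   $\Leftrightarrow \neg(E \models Q_1$ and $E \models Q_2)$   [by De Morgan's Law]
   $\Leftrightarrow E \not\models Q_1 \wedge Q_2$   [by semantic definition for $\wedge$]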

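$P = Q_1 \vee Q_2$:

1. $E \models \mathrm{pos}(Q_1 \vee Q_2)$
   $\Leftrightarrow E \models \mathrm{pos}(Q_1) \vee \mathrm{pos}(Q_2)$   [by definition of $\mathrm{pos}(Q_1 \vee Q_2)$]
   $\Leftrightarrow E \models \mathrm{pos}(Q_1)$ or $E \models \mathrm{pos}(Q_2)$   [by semantic definition for $\vee$]
   $\Leftrightarrow E \models Q_1$ or $E \models Q_2$   [by induction hypothesis 1]
   $\Leftrightarrow E \models Q_1 \vee Q_2$   [by semantic definition for $\vee$]

2. $E \models \mathrm{neg}(Q_1 \vee Q_2)$
   $\Leftrightarrow E \models \mathrm{neg}(Q_1) \wedge \mathrm{neg}(Q_2)$   [by definition of $\mathrm{neg}(Q_1 \vee Q_2)$]
   $\Leftrightarrow E \models \mathrm{neg}(Q_1)$ and $E \models \mathrm{neg}(Q_2)$   [by semantic definition for $\wedge$]
   $\Leftrightarrow E \not\models Q_1$ and $E \not\models Q_2$   [by induction hypothesis 2]
   $\Leftrightarrow \neg(E \models Q_1$ or $E \models Q_2)$   [by De Morgan's Law]
   $\Leftrightarrow E \not\models Q_1 \vee Q_2$   [by semantic definition for $\vee$]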


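$P = \langle a\rangle Q$:

1. $E \models \mathrm{pos}(\langle a\rangle Q)$
   $\Leftrightarrow E \models \langle a\rangle\mathrm{pos}(Q)$   [by definition of $\mathrm{pos}(\langle a\rangle Q)$]
   $\Leftrightarrow F \models \mathrm{pos}(Q)$ for some $F$ such that $E \xrightarrow{a} F$   [by semantic definition for $\langle a\rangle$]
   $\Leftrightarrow F \models Q$ for some $F$ such that $E \xrightarrow{a} F$   [by induction hypothesis 1]
   $\Leftrightarrow E \models \langle a\rangle Q$   [by semantic definition for $\langle a\rangle$]

2. $E \models \mathrm{neg}(\langle a\rangle Q)$
   $\Leftrightarrow E \models [a]\mathrm{neg}(Q)$   [by definition of $\mathrm{neg}(\langle a\rangle Q)$]
   $\Leftrightarrow F \models \mathrm{neg}(Q)$ for all $F$ such that $E \xrightarrow{a} F$   [by semantic definition for $[a]$]
   $\Leftrightarrow F \not\models Q$ for all $F$ such that $E \xrightarrow{a} F$   [by induction hypothesis 2]
   $\Leftrightarrow E \not\models \langle a\rangle Q$   [by semantic definition for $\langle a\rangle$]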

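$P = [a]Q$:

1. $E \models \mathrm{pos}([a]Q)$
   $\Leftrightarrow E \models [a]\mathrm{pos}(Q)$   [by definition of $\mathrm{pos}([a]Q)$]
   $\Leftrightarrow F \models \mathrm{pos}(Q)$ for all $F$ such that $E \xrightarrow{a} F$   [by semantic definition for $[a]$]
   $\Leftrightarrow F \models Q$ for all $F$ such that $E \xrightarrow{a} F$   [by induction hypothesis 1]
   $\Leftrightarrow E \models [a]Q$   [by semantic definition for $[a]$]

2. $E \models \mathrm{neg}([a]Q)$
   $\Leftrightarrow E \models \langle a\rangle\mathrm{neg}(Q)$   [by definition of $\mathrm{neg}([a]Q)$]
   $\Leftrightarrow F \models \mathrm{neg}(Q)$ for some $F$ such that $E \xrightarrow{a} F$   [by semantic definition for $\langle a\rangle$]
   $\Leftrightarrow F \not\models Q$ for some $F$ such that $E \xrightarrow{a} F$   [by induction hypothesis 2]
   $\Leftrightarrow E \not\models [a]Q$   [by semantic definition for $[a]$]

Exercise 13.8 (page 343)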

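Fact: For all modal properties $P$, $\mathrm{neg}(\mathrm{neg}(P)) = P$.

Proof: By induction on the structure of $P$, arguing by cases on the structure of $P$.

$P = \mathbf{true}$: $\mathrm{neg}(\mathrm{neg}(\mathbf{true})) = \mathrm{neg}(\mathbf{false}) = \mathbf{true}$.

$P = \mathbf{false}$: $\mathrm{neg}(\mathrm{neg}(\mathbf{false})) = \mathrm{neg}(\mathbf{true}) = \mathbf{false}$.

$P = Q_1 \wedge Q_2$: By the inductive hypothesis we assume that $\mathrm{neg}(\mathrm{neg}(Q_1)) = Q_1$ and $\mathrm{neg}(\mathrm{neg}(Q_2)) = Q_2$.

Then $\mathrm{neg}(\mathrm{neg}(Q_1 \wedge Q_2)) = \mathrm{neg}(\mathrm{neg}(Q_1) \vee \mathrm{neg}(Q_2)) = \mathrm{neg}(\mathrm{neg}(Q_1)) \wedge \mathrm{neg}(\mathrm{neg}(Q_2)) = Q_1 \wedge Q_2$.

$P = Q_1 \vee Q_2$: By the inductive hypothesis we assume that $\mathrm{neg}(\mathrm{neg}(Q_1)) = Q_1$ and $\mathrm{neg}(\mathrm{neg}(Q_2)) = Q_2$.

Then $\mathrm{neg}(\mathrm{neg}(Q_1 \vee Q_2)) = \mathrm{neg}(\mathrm{neg}(Q_1) \wedge \mathrm{neg}(Q_2)) = \mathrm{neg}(\mathrm{neg}(Q_1)) \vee \mathrm{neg}(\mathrm{neg}(Q_2)) = Q_1 \vee Q_2$.

$P = \langle a\rangle Q$: By the inductive hypothesis we assume that $\mathrm{neg}(\mathrm{neg}(Q)) = Q$.
Then $\mathrm{neg}(\mathrm{neg}(\langle a\rangle Q)) = \mathrm{neg}([a]\mathrm{neg}(Q)) = \langle a\rangle\mathrm{neg}(\mathrm{neg}(Q)) = \langle a\rangle Q$.

$P = [a]Q$: By the inductive hypothesis we assume that $\mathrm{neg}(\mathrm{neg}(Q)) = Q$.
Then $\mathrm{neg}(\mathrm{neg}([a]Q)) = \mathrm{neg}(\langle a\rangle\mathrm{neg}(Q)) = [a]\mathrm{neg}(\mathrm{neg}(Q)) = [a]Q$.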
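
The two translations can be written as mutually recursive functions, following the defining clauses quoted in the proof of Theorem 13.6. The sketch below is illustrative: the tuple encoding of formulas and the sample formula q are assumptions made here, not notation from the text.

```python
# Each connective paired with its dual under negation.
DUAL = {"true": "false", "false": "true",
        "and": "or", "or": "and",
        "dia": "box", "box": "dia"}


def pos(p):
    """Negation-free formula equivalent to p."""
    op = p[0]
    if op in ("true", "false"):
        return p
    if op == "not":
        return neg(p[1])                      # pos(not Q) = neg(Q)
    if op in ("and", "or"):
        return (op, pos(p[1]), pos(p[2]))
    if op in ("dia", "box"):
        return (op, p[1], pos(p[2]))
    raise ValueError(f"unknown operator: {op}")


def neg(p):
    """Negation-free formula equivalent to the negation of p."""
    op = p[0]
    if op in ("true", "false"):
        return (DUAL[op],)
    if op == "not":
        return pos(p[1])                      # neg(not Q) = pos(Q)
    if op in ("and", "or"):
        return (DUAL[op], neg(p[1]), neg(p[2]))
    if op in ("dia", "box"):
        return (DUAL[op], p[1], neg(p[2]))
    raise ValueError(f"unknown operator: {op}")


# Sample negation-free formula: <a>true and [b]false.
q = ("and", ("dia", "a", ("true",)), ("box", "b", ("false",)))
print(neg(q))
print(neg(neg(q)) == q)
```

Applying neg twice to the negation-free formula q returns q itself, which is the instance of the Fact that the induction above establishes in general.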


Exercise  13.9  (page  345) 

The  properties  distinguishing  between  C  and  D  were  presented  informally  in 
the  solution  to  Exercise  11.16(b).  We  need  simply  express  these  properties 
in  the  language  of  HML. 

• From state C you may do an ‘a’ action and be in a state in which, no
matter how you do a ‘b’ action you will either not be able to do a ‘c’
action or you will not be able to do a ‘d’ action. Formally:

$C \models \langle a\rangle[b]([c]\mathbf{false} \vee [d]\mathbf{false})$

• On the other hand, from state D no matter how you do an ‘a’ action,
you will be able to do a ‘b’ action and end up in a state in which you
can both do a ‘c’ action as well as a ‘d’ action. Formally:

$D \models [a]\langle b\rangle(\langle c\rangle\mathbf{true} \wedge \langle d\rangle\mathbf{true})$

Note that these properties are, naturally, the negations of each other:
each is obtained by applying neg to the other.

Exercise  13.11  (page  350) 

1. Consider the formula

$\langle a\rangle\mathbf{true} \wedge [-a]\mathbf{false} \wedge [-][-]\mathbf{false}$.

Clearly this characterises the process $a.0$:

• The first conjunct says that it is possible to do an $a$ transition.

• The second conjunct says that it is not possible to do anything
other than an $a$ transition.

• The final conjunct says that it is not possible to do two transitions.

2. The characteristic formula for $a.(b.0 + c.0)$ is

$\langle a\rangle\mathbf{true} \wedge [-a]\mathbf{false} \wedge [a](\langle b\rangle\mathbf{true} \wedge \langle c\rangle\mathbf{true} \wedge [-][-]\mathbf{false})$.

Exercise  13.12  (page  353) 


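1. $\|\langle a\rangle\mathbf{true}\| = \{E, E_1, F\}$

2. $\|\langle b\rangle\mathbf{true}\| = \{E_1, E_2\}$

3. $\|\langle a\rangle\langle a\rangle\mathbf{true}\| = \{E, E_1\}$

4. $\|\langle b\rangle\langle b\rangle\mathbf{true}\| = \emptyset$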

5. $\|\langle a\rangle[a]\mathbf{false}\| =$

6. $\|[b]\langle a\rangle\mathbf{true}\| = \{E, E_1, E_2$

Chapter  14 

Exercise  14.2  (page  360) 

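Let $A \stackrel{\text{def}}{=} 0$ with $\mathrm{Sort}(A) = \emptyset$, and $B \stackrel{\text{def}}{=} 0$ with $\mathrm{Sort}(B) = \{a\}$.

Then clearly $A \sim B$ although $\mathrm{Sort}(A) \neq \mathrm{Sort}(B)$.

If we let $X \stackrel{\text{def}}{=} a.0$ with $\mathrm{Sort}(X) = \{a\}$, then $A \parallel X \sim a.0$, but $B \parallel X \sim 0$.
Thus, $A \sim B$ whereas $A \parallel X \not\sim B \parallel X$.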




Exercise  14.3  (page  362) 

The  relevant  bisimulation  relation  is 

$\{(C_2,\ C \parallel C),\ (C_2',\ \mathit{dec}.C \parallel C),\ (C_2',\ C \parallel \mathit{dec}.C),\ (C_2'',\ \mathit{dec}.C \parallel \mathit{dec}.C)\}$.

Exercise  14.4  (page  365) 

The  safety  property  holds:  a  car  may  cross  only  if  the  barrier  is  up,  and  a 
train  may  cross  only  if  the  signal  is  green;  and  the  controller  ensures  that 
the  barrier  is  never  up  at  the  same  time  that  the  signal  is  green  by  raising 
the  barrier  only  when  the  signal  is  red  and  turning  the  signal  green  only 
when  the  barrier  is  down. 

The  liveness  properties,  however,  fail  to  hold  as  given.  When  a  car 
arrives,  it  is  not  necessarily  the  case  that  the  barrier  will  eventually  go  up. 
It may be the case that an endless stream of trains arrives, and that the
controller  repeatedly  turns  the  signal  green  to  allow  each  of  these  trains  to 
cross  the  intersection  without  ever  raising  the  barrier  to  allow  the  waiting 
car  to  pass.  Equally,  the  controller  may  allow  an  endless  stream  of  cars  to 
pass,  never  changing  the  signal  to  green  to  allow  a  waiting  train  to  pass. 

These  liveness  properties  can  be  weakened  to  read: 

•  If  a  car  arrives,  eventually  the  barrier  may  go  up. 

•  If  a  train  arrives,  eventually  the  signal  may  turn  green. 

These  weakened  properties  do  hold  of  the  system. 

In reality, a barrier typically remains up, to allow cars to cross the
intersection freely, until a train arrives; the arrival of a train signals the controller,
which then lowers the barrier, then turns the signal to green, then turns the
signal to red again, and finally raises the barrier once again. If the
components are built correctly following this protocol, then the original liveness
properties will hold, along with the safety properties.

Exercise  14.5  (page  368) 

The  only  way  that  the  system  can  deadlock  is  if  every  philosopher  is  wanting 
to  pick  up  a  fork  which  is  not  available.  (No  philosopher  would  ever  be 
hindered  from  eating  nor  from  setting  a  fork  down  on  the  table.)  No  two 
philosophers  can  be  wanting  to  pick  up  the  same  fork,  as  each  one  of  them 
must  be  prevented  from  picking  it  up  by  the  other  already  holding  it.  Since 
each  philosopher  is  stopped  by  the  absence  of  a  different  fork,  every  fork 
must  be  in  the  hand  of  some  philosopher,  and  thus  each  philosopher  must 
be  in  the  state  of  having  just  picked  up  their  first  fork.  But  that  would  mean 
that philosophers 1 and 2 are both holding fork 2, which is impossible.

Exercise 14.6 (page 371)

We argue that if the first process reaches the state where it is ready to
enter the critical section, then the second process will not be able to reach
the analogous state until the first process enters and then exits the critical
section. A symmetric argument shows that the second process being in the
critical section prevents the first from also being so.

• When the first process becomes ready to enter the critical section (i.e.,
enters state R2), then the b1 process must be in the state B1t, and
either the b2 process is in the state B2f or the k process is in the
state K1.

• Before the first process enters and exits the critical section, if the
second process is waiting to be allowed to enter the critical section (i.e.,
is in state W2), then the b2 process must be in state B2t. Hence (from
above) the k process is in the state K1. Thus this process will not
be able to move to state R2 and enter the critical section.

Exercise  14.7  (page  373) 

The  enhanced  message-passing  protocol  requires  no  change  to  the  Sender, 
only  to  the  Receiver  and  the  Medium.  The  acknowledgement  that  the 
Sender  is  awaiting  will  come  from  the  Medium  rather  than  directly  from 
the  Receiver,  but  this  difference  is  not  noticeable  from  the  point  of  view 
of  the  Sender.  Thus  its  definition  remains  unchanged: 

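$\mathit{Sender} \stackrel{\text{def}}{=} \mathit{in}.\mathit{snd}.S$        $S \stackrel{\text{def}}{=} \mathit{ack}.\mathit{Sender} + \mathit{err}.\mathit{snd}.S$

$\mathrm{Sort}(\mathit{Sender}) = \{\mathit{snd}, \mathit{ack}, \mathit{err}\}$

Again, its transition system is depicted thus:

[Figure: transition system of the Sender]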

The enhanced Receiver must cater for the possibility of its
acknowledgement being lost. After receiving a message (via the “rcv” action) and
forwarding it on (via the “out” action), it will issue an auxiliary
acknowledgement to the Medium (via a “rack” action). At this point it will be ready
to receive a new message. However, it may instead receive an auxiliary
error message from the Medium (modelled by a “rerr” action), indicating
that the acknowledgement was lost, in which case it will retransmit this
acknowledgement. The new definition is as follows:

$\mathit{Receiver} \stackrel{\text{def}}{=} \mathit{rcv}.\mathit{out}.\mathit{rack}.\mathit{Receiver} + \mathit{rerr}.\mathit{rack}.\mathit{Receiver}$

$\mathrm{Sort}(\mathit{Receiver}) = \{\mathit{rcv}, \mathit{rack}, \mathit{rerr}\}$

Its transition system is depicted thus:

[Figure: transition system of the Receiver]

The Medium must now interact with the Receiver
in delivering the acknowledgement from the Receiver to the Sender of
the safe arrival of the message being delivered. After passing the message
to the Receiver (via the “rcv” action), the Medium awaits the auxiliary
acknowledgement from the Receiver (modelled by the “rack” action). It
then either passes the acknowledgement along to the Sender (via the “ack”
action); or it may lose the acknowledgement (modelled by a “rerr” action),
and await a new acknowledgement from the Receiver. The new definition
is as follows:

$\mathit{Medium} \stackrel{\text{def}}{=} \mathit{snd}.(\mathit{rcv}.\mathit{rack}.M + \mathit{err}.\mathit{Medium})$

$M \stackrel{\text{def}}{=} \mathit{ack}.\mathit{Medium} + \mathit{rerr}.\mathit{rack}.M$

$\mathrm{Sort}(\mathit{Medium}) = \{\mathit{snd}, \mathit{rcv}, \mathit{err}\}$

Its transition system is depicted thus:

[Figure: transition system of the Medium]

Figure 15.7: The enhanced message-passing system.


The  symmetry  reflected  in  the  transition  diagram  makes  clear  the  similarity 
in  the  manner  that  the  Medium  treats  the  Sender  and  Receiver. 

The complete system is again defined to be the composition of these
three components:

$\mathit{System} \stackrel{\text{def}}{=} \mathit{Sender} \parallel \mathit{Medium} \parallel \mathit{Receiver}$

but now with the following configuration:

[Figure: configuration of the enhanced system]

The behaviour of the complete enhanced system is thus depicted by the
transition system in Figure 15.7. The symmetry between the Sender
and the Receiver is immediately noticeable in this transition system.
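
The behaviour of the composed system can be explored mechanically. The following sketch rests on assumptions that are not taken from the text: each component is deterministic, parallel components must synchronise on every action that belongs to both of their sorts, and the sort of each component is taken to be the full set of actions occurring in its definition (which is larger than the Sort sets listed above). State names such as "snd.S" are simply labels for the intermediate terms.

```python
from collections import deque

# Each component: (initial state, assumed sort, {state: {action: next state}}).
SENDER = ("Sender", {"in", "snd", "ack", "err"}, {
    "Sender": {"in": "snd.S"},
    "snd.S": {"snd": "S"},
    "S": {"ack": "Sender", "err": "snd.S"},
})
MEDIUM = ("Medium", {"snd", "rcv", "err", "ack", "rack", "rerr"}, {
    "Medium": {"snd": "rcv.rack.M + err.Medium"},
    "rcv.rack.M + err.Medium": {"rcv": "rack.M", "err": "Medium"},
    "rack.M": {"rack": "M"},
    "M": {"ack": "Medium", "rerr": "rack.M"},
})
RECEIVER = ("Receiver", {"rcv", "out", "rack", "rerr"}, {
    "Receiver": {"rcv": "out.rack.Receiver", "rerr": "rack.Receiver"},
    "out.rack.Receiver": {"out": "rack.Receiver"},
    "rack.Receiver": {"rack": "Receiver"},
})


def explore(components):
    """Breadth-first search of the synchronised product of the components."""
    start = tuple(init for (init, _, _) in components)
    actions = set().union(*(sort for (_, sort, _) in components))
    graph, todo = {}, deque([start])
    while todo:
        state = todo.popleft()
        if state in graph:
            continue
        graph[state] = {}
        for act in sorted(actions):
            nxt = list(state)
            enabled = True
            for i, (_, sort, moves) in enumerate(components):
                if act in sort:  # this component must take part in the action
                    target = moves[state[i]].get(act)
                    if target is None:
                        enabled = False
                        break
                    nxt[i] = target
            if enabled:
                graph[state][act] = tuple(nxt)
                todo.append(tuple(nxt))
    return graph


graph = explore([SENDER, MEDIUM, RECEIVER])
for state, moves in graph.items():
    print(state, "->", moves)
```

Under these assumptions the search finds six reachable states, each with at least one outgoing transition, so this model of the enhanced system cannot deadlock.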

Exercise 14.8 (page 377)

• From the state $S_i \parallel M \parallel R_i$ (the initial state being $S_0 \parallel M \parallel R_0$) the
system can do an in action and nothing else, leaving the system in the
state $S_i' \parallel M \parallel R_i$.

• From here, the system will not be able to do a further in action until
the Sender process reaches the state $S_{1-i}$.

• This can only happen after the Sender synchronises with the
Medium on an $\mathit{ack}_i$ action, leaving the Sender in the state $S_{1-i}$ and the
Medium in the state $M$.

• Until this $\mathit{ack}_i$ synchronisation occurs, the Sender can repeatedly
alternate between the actions $s_i$ and $t$, so the system will not deadlock.

• The Medium can only do this $\mathit{ack}_i$ action after it synchronises with the
Receiver on a $\mathit{rack}_i$ action.

• The Receiver in turn can only do this $\mathit{rack}_i$ action after doing an out
action, and then leaves the Receiver in the state $R_{1-i}$.

• The system will then be in the state $S_{1-i} \parallel M \parallel R_{1-i}$, from which the
above argument applies.


Chapter  15 


Exercise  15.2  (page  384) 


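$$
\begin{array}{rcll}
\textit{Deadlock-free} & = & \Box\langle - \rangle\mathbf{true} & \text{(by definition)} \\
 & = & \neg\Diamond\neg\langle - \rangle\mathbf{true} & \text{(since } \Box P = \neg\Diamond\neg P\text{)} \\
 & = & \neg\Diamond[-]\mathbf{false} & \text{(since } \neg\langle - \rangle P = [-]\neg P\text{)} \\
 & = & \neg\textit{Deadlockable} & \text{(by definition)}
\end{array}
$$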


Exercise  15.3  (page  385) 

$\mathsf{Ev}\,P$ asserts that $P$ must eventually become true.

This is almost the same as $Q \mathbin{\mathsf{U}} P$, which also asserts that $P$ must
eventually become true; the only difference is the added requirement that until
$P$ becomes true, $Q$ must remain true.

However, this added requirement is vacuous if we take the property $Q$
to be true, as of course true is always true anyway.

Hence, $\mathsf{Ev}\,P = \mathbf{true} \mathbin{\mathsf{U}} P$.
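
Exercise 15.4 (page 386)

Fact: $E \models_V P$ if, and only if, $E \in \|P\|_V$.

Proof: By induction on the structure of $P$, arguing by cases on the structure of $P$.

$P = \mathbf{true}$: $E \models_V \mathbf{true} \Leftrightarrow E \in \mathit{States} \Leftrightarrow E \in \|\mathbf{true}\|_V$

$P = \mathbf{false}$: $E \models_V \mathbf{false} \Leftrightarrow E \in \emptyset \Leftrightarrow E \in \|\mathbf{false}\|_V$

$P = X$: $E \models_V X \Leftrightarrow E \in V(X) \Leftrightarrow E \in \|X\|_V$

$P = \neg Q$: $E \models_V \neg Q \Leftrightarrow E \not\models_V Q$
   $\Leftrightarrow E \notin \|Q\|_V$
   $\Leftrightarrow E \in \|\neg Q\|_V$

$P = Q_1 \wedge Q_2$: $E \models_V Q_1 \wedge Q_2 \Leftrightarrow E \models_V Q_1$ and $E \models_V Q_2$
   $\Leftrightarrow E \in \|Q_1\|_V$ and $E \in \|Q_2\|_V$
   $\Leftrightarrow E \in \|Q_1\|_V \cap \|Q_2\|_V$
   $\Leftrightarrow E \in \|Q_1 \wedge Q_2\|_V$

$P = Q_1 \vee Q_2$: $E \models_V Q_1 \vee Q_2 \Leftrightarrow E \models_V Q_1$ or $E \models_V Q_2$
   $\Leftrightarrow E \in \|Q_1\|_V$ or $E \in \|Q_2\|_V$
   $\Leftrightarrow E \in \|Q_1\|_V \cup \|Q_2\|_V$
   $\Leftrightarrow E \in \|Q_1 \vee Q_2\|_V$

$P = \langle a\rangle Q$: $E \models_V \langle a\rangle Q \Leftrightarrow E \xrightarrow{a} E'$ for some $E'$ such that $E' \models_V Q$
   $\Leftrightarrow E \xrightarrow{a} E'$ for some $E'$ such that $E' \in \|Q\|_V$
   $\Leftrightarrow E \in \|\langle a\rangle Q\|_V$

$P = [a]Q$: $E \models_V [a]Q \Leftrightarrow E \xrightarrow{a} E'$ implies $E' \models_V Q$
   $\Leftrightarrow E \xrightarrow{a} E'$ implies $E' \in \|Q\|_V$
   $\Leftrightarrow E \in \|[a]Q\|_V$   $\Box$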


Exercise  15.5  (page  388) 

$\|\langle a\rangle X\|_{V[X \mapsto \emptyset]} = \{E \in \mathit{States} : E \xrightarrow{a} E' \text{ for some } E' \in \emptyset\} = \emptyset$.

Exercise 15.6 (page 389)

By Exercise 15.5, the empty set $S = \emptyset$ satisfies $S = \|\langle a\rangle X\|_{V[X \mapsto S]}$ and hence
must be the least fixed point of the function $f(S) = \|\langle a\rangle X\|_{V[X \mapsto S]}$.

Let $A = \{E \in \mathit{States} : E \xrightarrow{a} \cdot \xrightarrow{a} \cdot \xrightarrow{a} \cdots\}$ be the set of states
which we intended to capture in Example 15.4 with the recursive property
$X = \langle a\rangle X$. As demonstrated in Example 15.4, this set is a fixed point; we
shall demonstrate that $A$ must in fact be the greatest fixed point.

To this end, suppose that $S$ is any fixed point:

$S = f(S) = \|\langle a\rangle X\|_{V[X \mapsto S]} = \{E \in \mathit{States} : E \xrightarrow{a} E' \text{ for some } E' \in S\}$

and suppose further that $E \in S$. We need to show that $E \in A$.

• Since $E \in S$, $E \xrightarrow{a} E'$ for some $E' \in S$.

• Since $E' \in S$, $E' \xrightarrow{a} E''$ for some $E'' \in S$.

• Since $E'' \in S$, $E'' \xrightarrow{a} E'''$ for some $E''' \in S$.

Continuing in this fashion, it becomes clear that $E \in A$. Hence every
fixed point $S$ is contained in $A$, and so $A$ is the greatest fixed point.

As for a fixed point of the function $f(S) = \|\langle a\rangle X\|_{V[X \mapsto S]}$ which is neither
the least nor greatest fixed point, consider the process with two states A and
B and two transitions $A \xrightarrow{a} A$ and $B \xrightarrow{a} B$. Then $\emptyset$, $\{A\}$ and $\{A, B\}$ are
all fixed points of this function.
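
For a finite transition system the extreme fixed points of f can be computed by iteration, since f is monotonic: starting from the empty set yields the least fixed point, and starting from the set of all states yields the greatest. The sketch below is an illustration only; the four-state system (including the extra states C and D) is hypothetical.

```python
def f(S, trans):
    """f(S) = ||<a>X|| with X read as S: states with an a-transition into S."""
    return frozenset(s for (s, a, t) in trans if a == "a" and t in S)


def iterate(start, trans):
    """Apply f repeatedly from `start` until the set stops changing."""
    current = frozenset(start)
    while f(current, trans) != current:
        current = f(current, trans)
    return current


# Hypothetical system: A and B loop forever, C can do one step to D, D is stuck.
states = {"A", "B", "C", "D"}
trans = {("A", "a", "A"), ("B", "a", "B"), ("C", "a", "D")}

lfp = iterate(set(), trans)    # iterate upwards from the empty set
gfp = iterate(states, trans)   # iterate downwards from all states
print(sorted(lfp), sorted(gfp), f({"A"}, trans) == {"A"})
```

Here the least fixed point is empty, the greatest fixed point contains exactly the two states with an infinite path of a-transitions, and the set containing only A is a fixed point strictly between the two.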


Exercise  15.7  (page  392) 

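We prove this by induction - and arguing by cases - on the structure of $P$.
However, we only present the three cases which don’t appear in the proof
of the analogous result for HML (Theorem 13.6, page 343).

$P = X$:

$E \models_V \mathrm{neg}(X) \Leftrightarrow E \models_V X \Leftrightarrow E \in V(X) \Leftrightarrow E \notin \overline{V}(X) \Leftrightarrow E \not\models_{\overline{V}} X$

$P = \mu X.Q$:

$E \models_V \mathrm{neg}(\mu X.Q)$
   $\Leftrightarrow E \models_V \nu X.\mathrm{neg}(Q)$
   $\Leftrightarrow \exists S \subseteq \mathit{States}$ : $E \in S$ and $\forall F \in S$ : $F \models_{V[X \mapsto S]} \mathrm{neg}(Q)$
   $\Leftrightarrow \exists S \subseteq \mathit{States}$ : $E \notin S$ and $\forall F \notin S$ : $F \models_{V[X \mapsto \overline{S}]} \mathrm{neg}(Q)$
   $\Leftrightarrow \exists S \subseteq \mathit{States}$ : $E \notin S$ and $\forall F \notin S$ : $F \not\models_{\overline{V}[X \mapsto S]} Q$
   $\Leftrightarrow E \not\models_{\overline{V}} \mu X.Q$

$P = \nu X.Q$:

$E \models_V \mathrm{neg}(\nu X.Q)$
   $\Leftrightarrow E \models_V \mu X.\mathrm{neg}(Q)$
   $\Leftrightarrow \forall S \subseteq \mathit{States}$ : if $E \notin S$ then $\exists F \notin S$ such that $F \models_{V[X \mapsto S]} \mathrm{neg}(Q)$
   $\Leftrightarrow \forall S \subseteq \mathit{States}$ : if $E \in S$ then $\exists F \in S$ such that $F \models_{V[X \mapsto \overline{S}]} \mathrm{neg}(Q)$
   $\Leftrightarrow \forall S \subseteq \mathit{States}$ : if $E \in S$ then $\exists F \in S$ such that $F \not\models_{\overline{V}[X \mapsto S]} Q$
   $\Leftrightarrow E \not\models_{\overline{V}} \nu X.Q$   $\Box$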

Exercise  15.12  (page  399) 

1.  With  a  least  fixed  point,  we  cannot  be  allowed  to  unroll  the  recursive 
equation  infinitely  often  in  verifying  that  the  property  P  is  true  in 
every  state. 

At every state we reach, the property P must hold. But we must
eventually have nowhere to go; that is, the process must eventually
deadlock.

Thus  this  property  is  true  as  long  as  P  is  true  in  every  state  of  the 
process  and  every  run  of  the  process  deadlocks. 

2.  With  a  greatest  fixed  point,  we  are  allowed  to  unroll  the  recursive 
property  infinitely  often  in  our  search  for  a  state  in  which  P  is  true. 

At  each  state,  either  the  property  P  must  hold,  or  it  must  be  possible 
to  make  a  transition  and  continue  the  search  for  a  state  in  which  P 
holds;  however,  we  need  never  complete  this  search. 

Thus  this  property  is  true  if  P  is  true  in  some  state,  or  if  there  is  an 
infinite  path  through  the  process. 

3.  With  a  greatest  fixed  point,  we  are  allowed  to  unroll  the  recursive 
property  infinitely  often  in  our  search  for  a  state  in  which  Q  is  true. 

Thus  the  property  is  true  if  P  is  true  for  as  long  as  Q  is  not  true,  but 
until  Q  becomes  true  -  if  ever  -  it  must  be  possible  to  do  something. 

Exercise  15.13  (page  401) 

1. $P$ almost always holds along some $a^\omega$ path.

In order for this property to hold, there must be a state reachable by
a sequence of $a$ transitions from which an $a^\omega$ path exists along which
$P$ is always true.

We have already seen how to express the property that $P$ always holds
along some $a^\omega$ path:

$\Phi = \nu X.\, P \wedge \langle a\rangle X$.

We need only find a state satisfying this property which can be reached
by a sequence of $a$-transitions:

$\mu X.\, \Phi \vee \langle a\rangle X$.

Writing this out in full by substituting in the formula for $\Phi$ - whilst
at the same time changing one of the variables to avoid confusion - we
get the following:

$\mu Z.\, (\nu X.\, P \wedge \langle a\rangle X) \vee \langle a\rangle Z$.

2. $P$ holds infinitely often along some $a^\omega$ path.

In order for this property to be true, we must be able to reach a state
by doing a sequence of $a$ transitions in which $P$ holds, and then to
repeat this forever.

We will, therefore, have a least fixed point construction - to allow us
to look for the state in which $P$ holds - embedded within a greatest
fixed point construction - to allow us to repeat this search over and
over again forever.

$\nu Z.\, \mu X.\, (P \wedge \langle a\rangle Z) \vee \langle a\rangle X$.
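
On a finite transition system both formulas can be evaluated by nested fixed-point iteration, computing each least fixed point upwards from the empty set and each greatest fixed point downwards from the set of all states. The sketch below is illustrative only: the three-state system and the choice of the set P are hypothetical, and the property P is represented directly by the set of states satisfying it.

```python
def pre(S, trans):
    """States with an a-transition into S, i.e. ||<a>X|| with X read as S."""
    return {s for (s, a, t) in trans if a == "a" and t in S}


def almost_always(states, trans, P):
    """Evaluate mu Z. (nu X. P and <a>X) or <a>Z."""
    phi = set(states)                     # greatest fixed point: start from all states
    while (P & pre(phi, trans)) != phi:
        phi = P & pre(phi, trans)
    Z = set()                             # least fixed point: start from the empty set
    while (phi | pre(Z, trans)) != Z:
        Z = phi | pre(Z, trans)
    return Z


def infinitely_often(states, trans, P):
    """Evaluate nu Z. mu X. (P and <a>Z) or <a>X."""
    Z = set(states)                       # outer greatest fixed point
    while True:
        X = set()                         # inner least fixed point, recomputed for each Z
        while True:
            new_X = (P & pre(Z, trans)) | pre(X, trans)
            if new_X == X:
                break
            X = new_X
        if X == Z:
            return Z
        Z = X


# Hypothetical system: s0 and s1 alternate forever, u loops on itself.
states = {"s0", "s1", "u"}
trans = {("s0", "a", "s1"), ("s1", "a", "s0"), ("u", "a", "u")}
P = {"s0", "u"}                           # states in which the property P holds

print(sorted(almost_always(states, trans, P)))
print(sorted(infinitely_often(states, trans, P)))
```

In this example P holds infinitely often along the path from every state, but it holds almost always only along the path from u, since the path through s0 and s1 leaves P on every second step.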